1 PREFACE


This book grew out of a project to publish source code for cryptographic software, namely PGP (Pretty Good Privacy), a software 
package for the encryption of electronic mail and computer files. PGP is the most widely used software in the world for email 
encryption. Pretty Good Privacy, Inc. (or “PGP”) has published the source code of PGP for peer review, a long-standing tradition
in the history of PGP. The first time a fully implemented cryptographic software package was published in its entirety in book form 
was “PGP Source Code and Internals,” by Philip Zimmermann, published by The MIT Press, 1995, ISBN 0-262-24039-4. 

Peer review of the source code is important to get users to trust the software, since any weaknesses can be detected by knowledgeable 
experts who make the effort to review the code. But peer review cannot be completely effective unless the experts conducting the 
review can compile and test the software, and verify that it is the same as the software products that are published electronically. 
To facilitate that, PGP publishes its source code in printed form that can be scanned into a computer via OCR (optical character 
recognition) technology. 

Why not publish the source code in electronic form? As you may know, cryptographic software is subject to U.S. export control 
laws and regulations. The new 1997 Commerce Department Export Administration Regulations (EAR) explicitly provide that 
“A printed book or other printed material setting forth encryption source code is not itself subject to the EAR.” (see 15 C.F.R. 
734.3(b)(2)). PGP, in an overabundance of caution, has only made available its source code in a form that is not subject to those 
regulations. So, books containing cryptographic source code may be published, and after they are published they may be exported, 
but only while they are still in printed form. 

Electronic commerce on the Internet cannot be fully successful without strong cryptography. Cryptography is important for
protecting our privacy, civil liberties, and the security of our personal and business transactions in the information age. The 
widespread deployment of strong cryptography can help us regain some of the privacy and security that we have lost due to 
information technology. Further, strong cryptography (in the form of PGP) has already proven itself to be a valuable tool for the 
protection of human rights in oppressive countries around the world, by keeping those governments from reading the communications 
of human rights workers. 

This book of tools contains no cryptographic software of any kind, nor does it call, connect, nor integrate in any way with 
cryptographic software. But it does contain tools that make it easy to publish source code in book form. And it makes it easy to 
scan such source code in with OCR software rapidly and accurately. 

Philip Zimmermann prz@mit.edu 

November 1997 


2 INTRODUCTION 


This book contains tools for printing computer source code on paper in human-readable form and reconstructing it exactly using 
automated tools. While standard OCR software can recover most of the graphic characters, non-printing characters like tabs, spaces, 
newlines and form feeds cause problems. 

In fact, these tools can print any ASCII text file; it's just that the attention these tools pay to spacing is particularly valuable for 
computer source code. The two-dimensional indentation structure of source code is very important to its comprehensibility. In some 
cases, distinctions between non-printing characters are critical: the standard make utility will not accept spaces where it expects to 
see a tab character. 

Producing a byte-for-byte identical copy of the original is also valuable for authentication, as you can verify a checksum. 
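
As a minimal sketch of why byte-for-byte fidelity matters for authentication, the Python fragment below compares digests of an original makefile fragment and of a reconstruction in which a tab was silently replaced by spaces. The choice of SHA-256 and the names digest, original and damaged are assumptions made for illustration; this book does not prescribe a particular checksum for that purpose.

```python
import hashlib

def digest(data: bytes) -> str:
    """Return the SHA-256 digest of a byte string as a hex string."""
    return hashlib.sha256(data).hexdigest()

# A makefile rule whose command line must start with a tab.
original = b"all:\n\tcc -o prog prog.c\n"
# A reconstruction that looks the same on paper but uses spaces instead.
damaged = b"all:\n        cc -o prog prog.c\n"

exact_copy_matches = digest(original) == digest(bytes(original))
damaged_copy_differs = digest(original) != digest(damaged)
```

Only the exact copy yields the published digest, so a reader who holds that digest can tell a faithful reconstruction from one that merely looks right.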

There are five problems we have addressed: 


1. Getting good OCR accuracy.

2. Preserving whitespace.

3. Preserving lines longer than can be printed on the page.

4. Dealing with data that isn't human-readable.

5. Detecting and correcting any residual errors.


The first problem is partly addressed by using a font designed for OCR purposes, OCR-B. OCR-A is a very ugly font that contains 
only the digits 0 through 9 and a few special punctuation symbols. OCR-B is a very readable monospaced font that contains a full 
ASCII set, and has been popular as a font on line printers for years because it distinguishes ambiguous characters and is clear even 
if fuzzy or distorted. 

The most unusual thing about the OCR-B font is the way that it prints a lower-case letter l, with a small hook on the bottom,
something like an upper-case L. This is to distinguish it from the numeral 1. We also made some modifications to the font, to print
the numeral 0 with a slash, and to print the vertical bar in a broken form. Both of these are such common variants that they should
not present any intelligibility barrier. Finally, we print the underscore character in a distinct manner that is hopefully not visually
distracting, but is clearly distinguishable from the minus sign even in the absence of a baseline reference.

The most significant part of getting good OCR accuracy is, however, using the OCR tools well. We’ve done a lot of testing and 
experimentation and present here a lot of information on what works and what doesn’t. 

To preserve whitespace, we added some special symbols to display spaces, tabs, and form feeds. A space is printed as a small 
triangular dot character, while a hollow rightward-pointing triangle (followed by blank spaces to the right tab stop) signifies a tab. 
A form feed is printed as a yen symbol, and the printed line is broken after the form feed. 

Making the dot triangular instead of square helps distinguish it from a period. To reduce the clutter on the page and make the 
text more readable, the space character is only printed as a small dot if it follows a blank on the page (a tab or another space), or 
comes immediately before the end of the line. Thus, the reader (human or software) must be able to distinguish one space from no 
spaces, but can find multiple spaces by counting the dots (and adding one). 
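
The display rule for spaces and tabs can be sketched in Python as follows. This is an illustration written from the description above, not the munge source: the function names, the use of a center dot and a currency sign to stand in for the two printed triangles, and the tab width of 8 are all assumptions.

```python
SPACE_MARK = "\u00b7"  # stands in for the small triangular dot
TAB_MARK = "\u00a4"    # stands in for the hollow rightward-pointing triangle

def show_whitespace(line: str, tab_width: int = 8) -> str:
    """Render one line (without its newline) with visible whitespace."""
    out = []
    col = 0
    prev_blank = False
    for i, ch in enumerate(line):
        if ch == "\t":
            # Tab mark, then plain blanks up to the next tab stop.
            pad = tab_width - (col % tab_width)
            out.append(TAB_MARK + " " * (pad - 1))
            col += pad
            prev_blank = True
        elif ch == " ":
            # Mark a space only after a blank or at the end of the line.
            at_end = i == len(line) - 1
            out.append(SPACE_MARK if (prev_blank or at_end) else " ")
            col += 1
            prev_blank = True
        else:
            out.append(ch)
            col += 1
            prev_blank = False
    return "".join(out)

def hide_whitespace(printed: str) -> str:
    """Invert show_whitespace: recover the original tabs and spaces."""
    out = []
    i = 0
    while i < len(printed):
        ch = printed[i]
        i += 1
        if ch == TAB_MARK:
            out.append("\t")
            # Plain blanks after a tab mark are padding, not data.
            while i < len(printed) and printed[i] == " ":
                i += 1
        elif ch == SPACE_MARK:
            out.append(" ")
        else:
            out.append(ch)
    return "".join(out)
```

Because every real space that follows a tab is printed as a mark, the decoder can discard however many plain blanks the OCR software reports after a tab symbol. That is why a wrong blank count there does no harm.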

The format is designed so that 80 characters (the still-common punched-card line length), plus checksums, can be printed on one
line of an 8.5x11” (or A4) page. Longer lines are managed with the simple technique of appending a big ugly black blob to the first part of
the line indicating that the next printed line should be concatenated with the current one with no intervening newline. Hopefully, 
its use is infrequent. 
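
A rough Python sketch of the continuation technique follows. It uses the figures given in the scanning chapter (lines over 80 columns are broken after 79 columns) and a pilcrow to stand in for the black blob. The function names are hypothetical, and the sketch counts characters rather than printed columns, so it ignores tab expansion.

```python
CONTINUE_MARK = "\u00b6"  # pilcrow, standing in for the printed black blob

def split_long_line(line: str, width: int = 80) -> list[str]:
    """Break a line longer than `width` into marked printable pieces."""
    pieces = []
    while len(line) > width:
        # Keep width - 1 characters and spend the last column on the mark.
        pieces.append(line[: width - 1] + CONTINUE_MARK)
        line = line[width - 1 :]
    pieces.append(line)
    return pieces

def join_pieces(pieces: list[str]) -> list[str]:
    """Rebuild the original lines from printed pieces."""
    lines = []
    current = ""
    for piece in pieces:
        if piece.endswith(CONTINUE_MARK):
            # Concatenate with the next piece, with no newline in between.
            current += piece[:-1]
        else:
            lines.append(current + piece)
            current = ""
    return lines
```

A line of exactly 80 characters is left alone; only longer lines pay the cost of the marker.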

While ASCII text is by far the most popular form, some source code is not readable in the usual way. It may be an audio clip, a 
graphic image bitmap, or something else that is manipulated with a specialized editing tool. For printing purposes, these tools just 
print any such files as a long string of gibberish in a 64-character set designed to be easy to OCR unambiguously. Although the 
tools recognize such binary data and apply extra consistency checks, that can be considered a separate step. 

Finally, the problem of residual errors arises. OCR software is not perfect, and uses a variety of heuristics and spelling-check 
dictionaries to clean up any residual errors in human-language text. This isn’t reliable enough for source code, so we have added 
per-page and per-line checksums to the printed material, and a series of tools to use those checksums to correct any remaining errors 
and convert the scanned text into a series of files again. 
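
To make the idea of a per-line checksum concrete, here is a small Python sketch. The layout used (eight hex digits of CRC-32, a space, then the text) and the function names are assumptions chosen for illustration. The actual printed format packs its per-line and per-page checks differently and is not reproduced here.

```python
import zlib

def line_check(text: str) -> str:
    """Eight hex digits of CRC-32 over the Latin-1 bytes of the line."""
    return f"{zlib.crc32(text.encode('latin-1')):08x}"

def munge_line(text: str) -> str:
    """Prefix a line with its check value, as it might appear in print."""
    return f"{line_check(text)} {text}"

def unmunge_line(printed: str):
    """Return the text if its check value matches, otherwise None."""
    tag, _, text = printed.partition(" ")
    return text if tag == line_check(text) else None
```

A single misread character changes the CRC, so the damaged line is rejected instead of being silently accepted. This is the property that a spelling dictionary cannot provide for source code.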

This “munged” form is what you see in most of the body of this book. We think it does a good job of presenting source code in a 
way that can be read easily by both humans and computers. 

The tools are command-line oriented and a bit clunky. This has a purpose beyond laziness on the authors’ parts: it keeps them 
small. Keeping them small makes the “bootstrapping” part of scanning this book easier, since you don’t have the tools to help you 
with that. 


3 SCANNING 


Our tests were done with OmniPage 7.0 on a Power Macintosh 8500/120 and an HP ScanJet 4c scanner with an automatic document 
feeder. The first part of this is heavily OmniPage-specific, as that appears to be the most widely available OCR software. 

The tools here were developed under Linux, and should be generally portable to any Unix platform. Since this book is about 
printing and scanning source code, we assume the readers have enough programming background to know how to build a program 
from a Makefile, understand the hazards of CR, LF or CRLF line endings, and such minor details without explicit mention. 

The first step to getting OmniPage 7 to work well is to set it up with options to disable all of its more advanced features for
preserving font changes and formatting. Look in the Settings menu.


• Create a Zone Contents File with all of ASCII in it, plus the extra bullet, currency, yen and pilcrow symbols. Name it “Source 
Code”. 

• Create a Source Code style set. Within it, create a Source Code zone style and make it the default.

• Set the font to something fixed-width, like Courier. 

• Set a fixed font size (10 point) and plain text, left-aligned. 

• Set the tab character to a space. 

• Set the text flow to hard line returns. 

• Set the margins to their widest. 

• The font mapping options are irrelevant. 


Go to the settings panel and: 


• Under Scanner, set the brightness to manual. With careful setting of the threshold, this generates much better results than 
either the automatic threshold or the 3D OCR. Around 144 has been a good setting for us; you may want to start there. 

• Under OCR, you'll build a training file to use later, but turn off automatic page orientation and select your Source Code style
set in the Output Options. Also set a reasonable reject character. (For testing, we used the pi symbol, which came across from
the Macintosh as a weird sequence, but you can use anything as long as you make the appropriate definition in subst.c.)


Do an initial scan of a few pages and create a manual zone encompassing all of the text. Leave some margin for page misalignment, 
and leave space on the sides for the left-right shift caused by the book binding being in different places on odd and even pages. 
Set the Zone Contents and the Style set to the Source Code settings. After setting the Style Set, the Zone Style should be 
automatically set correctly (since you set Source Code as the default). 

Then save the Zone Template, and in the pop-up menu under the Zone step on the main toolbar you can now select it.

Now we're ready to get characters recognized. The first results will be terrible, with lots of red (unrecognizable) and green (suspicious) 
text in the recognized window. Some tweaking will improve this enormously. 

The first step is setting a good black threshold. Auto brightness sets the threshold too low, making the character outlines bleed and
picking up a lot of glitches on mostly-blank pages. Try training OCR on the few pages you've scanned and look at the representative
characters. Adjust the threshold so the strokes are clear and distinct, neither so thin they are broken nor so thick they smear into
each other. The character that bleeds worst is lowercase w, while the underscore and tab symbols have the thinnest lines that need
worry.

You'll have to re-scan (you can just click the AUTO button) until you get satisfactory results. 

The next step is training. You should scan a significant number of pages and teach OmniPage about any characters it has difficulty 
with. There are several characters which have been printed in unusual ways which you must teach OmniPage about before it can 
recognize them reliably. We also have some characters that are unique, which the tools expect to be mapped to specific Latin-1 
characters to be processed. 

The characters most in need of training are as follows:


• Zero is printed “slashed.”

• Lowercase L has a curled tail to distinguish it clearly from other vertical characters like 1 and I.

• The or-bar or pipe symbol ‘|’ is printed “broken” with a gap in the middle to distinguish it similarly.

• The underscore character has little “serifs” on the end to distinguish it from a minus sign. We also raised it just a tad
higher than the normal underscore character, which was too low in the character cell to be reliably seen by OmniPage.

• Tabs are printed as a hollow right-pointing triangle, followed by blanks to the correct alignment position. If not trained
enough, OmniPage guesses this is a capital D. You should train OmniPage to recognize this symbol as a currency symbol
(Latin-1 244, octal).

• Any spaces in the original that follow a space, or a blank on the printed page, are printed as a tiny black triangle. You
should train OmniPage to recognize this as a center dot or bullet (Latin-1 267, octal). We didn't use a standard center dot because
OmniPage confused it with a period.

• Any form feeds in the original are printed as a yen currency symbol (Latin-1 245, octal).

• Lines over 80 columns long are broken after 79 columns by appending a big ugly black block. You should train OmniPage to
recognize this as a pilcrow (paragraph symbol, Latin-1 266, octal). We did this because after deciding something black and visible
was suitable, we found out the font we used doesn't have a pilcrow in it.


The zero and the tab character, because of their frequency, deserve special attention.
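
The Latin-1 codes quoted in the list above line up with the named symbols when read as octal. The short Python table below makes that mapping explicit so the training targets can be double-checked. The dictionary and function names are hypothetical, and the octal reading is an inference from the symbol names, not a statement taken from the tools.

```python
# Codes from the training list, written as Python octal literals.
SPECIAL_SYMBOLS = {
    0o244: "tab (currency sign)",
    0o267: "extra space (center dot)",
    0o245: "form feed (yen sign)",
    0o266: "line continuation (pilcrow)",
}

def latin1_char(code: int) -> str:
    """Decode a single Latin-1 code point into a one-character string."""
    return bytes([code]).decode("latin-1")

for code, meaning in SPECIAL_SYMBOLS.items():
    print(f"octal {code:o} = decimal {code} = {latin1_char(code)!r}: {meaning}")
```

If the OCR output is transferred between machines, it is worth confirming that these four code points survive the trip unchanged, since the postprocessing tools key on them.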

In addition, look for any unrecognized characters (in red) and retrain those pages. If you get an unrecognized character, that 
character needs training, but Caere says that “good examples” are best to train on, so if the training doesn't recognize a slightly 
fuzzy K, and there's a nice crisp K available to train on, use that. 

Other things that need training: 


• ~ (tilde), ^ (caret), ` (backquote) and ' (quote). These get dropped frequently unless you train them.

• i, j and ; (semicolon). These get mixed up.

• 3 and 5. These also get mixed up.

• Q can fail to be recognized.

• C and [ can be confused.

• c/C, o/O, p/P, s/S, u/U, v/V, w/W, y/Y and z/Z are often confused. This can be helped by some training.

• r gets confused with c and n. I don't understand c, but it happens.

• f gets confused with i.


The OCR training pages have lots of useful examples of troublesome characters. Scan a few pages of material, training each page,
then scan a few dozen pages and look for recognition problems. Look for what OmniPage reports as troublesome, and when you
have the repair program working, use it to find and report further errors. Train a few pages particularly dense in problems and
append the troublesome characters to the training file, then re-recognize the lot.

Double-check your training file for case errors. It's easy to miss the shift key in the middle of a lot of training, and doing so
produces terrible results even though OmniPage won't report anything amiss. We have spent a while wondering why OmniPage wasn't
recognizing capital S or capital W, only to find that OmniPage was just doing what it was trained to do.

We have heard some reports that OmniPage has problems with large training files. We have observed OmniPage suffering repeatable 
internal errors sometimes after massive training additions, but they were cured by deleting a few training images. Appending more 
training images to the training file did not cause the problem to re-appear. 

Repairing the OCR results 

If the only copy of the tools you have is printed in this book, see the next chapter on bootstrapping at this point. Here, we assume 
that you have the tools and they work. 

When you have some reasonable OCR results, delete any directory pages. With no checksum information, they just confuse the
postprocessing tools. (The tools will just stop with an error when they get to the “uncorrectable” directory name and you'll have
to delete it then, so it’s not fatal if you forget.) Copy the data to a machine that you have the repair and unmunge utilities on.

The repair utility attempts automatic table-driven correction of common scanning errors. You have to recompile it to change the
tables, but are encouraged to if you find a common problem that it does not correct reliably. If it gets stuck, it will deposit you
into your favorite editor on or slightly after the offending line. (The file you will be editing is the unprocessed portion of the input.)
After you correct the problem and quit the editor, repair will resume.

“Your favorite editor” is taken from the $VISUAL and $EDITOR environment variables, or the -e option to repair.
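
The table-driven correction that repair attempts before falling back to the editor can be pictured with the following Python sketch. It is not the repair source. The confusion table, the function names, the limit of two substitutions per line, and the use of CRC-32 as the line check are all assumptions made for illustration.

```python
import zlib
from itertools import combinations

# Hypothetical table of characters that OCR tends to swap.
CONFUSIONS = {"O": "0", "0": "O", "l": "1", "1": "l", "S": "5", "5": "S"}

def checksum(text: str) -> int:
    """CRC-32 over the Latin-1 bytes of a line (a stand-in line check)."""
    return zlib.crc32(text.encode("latin-1"))

def try_repair(scanned: str, expected: int, max_changes: int = 2):
    """Search for substitutions that make the line match its checksum."""
    if checksum(scanned) == expected:
        return scanned
    # Only positions holding a confusable character are worth trying.
    spots = [i for i, ch in enumerate(scanned) if ch in CONFUSIONS]
    for count in range(1, max_changes + 1):
        for combo in combinations(spots, count):
            chars = list(scanned)
            for i in combo:
                chars[i] = CONFUSIONS[chars[i]]
            candidate = "".join(chars)
            if checksum(candidate) == expected:
                return candidate
    return None  # give up; a person has to look at this line
```

A None result corresponds to the point at which repair hands the line to the editor. Keeping the table small keeps the search fast, at the price of more manual corrections.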

The repair utility never alters the original input file. It will produce corrected output for file in file.out, and when it has to stop,
it writes any remaining uncorrected input back out to file.in (via a temporary file.dump) and lets you edit this file. If you re-run 
repair on file and file.in exists, repair will restart from there, so you may safely quit and re-run repair as often as you like. (But if 
you change the input file, you need to delete the .in file for repair to notice the change.) 


Statistics on repair’s work are printed to file.log. This is an excellent place to look to see if any characters require more training. 
As it works, repair prints the line it is working on. If you see it make a mistake or get stuck, you can interrupt it (control-C or 
whatever is appropriate), and it will immediately drop into the editor. If you interrupt it a second time, it will exit rather than 
invoking the editor. If the editor returns a non-zero result code (fails), repair will also stop. (E.g. :cq in vim.)

One thing that repair fixes without the least trouble is the number of spaces expected after a printing tab character. It’s such an 
omnipresent OCR software error that repair doesn’t even log it as a correction. 

In some cases, repair can miscorrect a line and go on to the next line, possibly even more than once, finally giving up a few lines 
below the actual error. If you are having trouble spotting the error, one helpful trick is to exit the editor and let repair try to fix 
the page again, but interrupt it while it is still working on the first line, before it has found the miscorrection. 

The Nasty Lines 

Some lines of code, particularly those containing long runs of underscore or minus characters, are particularly difficult to scan 
reliably. The repair program has a special “nasty lines” feature to deal with this. If a file named “nastylines” (or as specified by
the -l option) exists, its lines are checksummed and are considered as total replacements for any input line with the same checksum.
So, for example, if you place a blank line in the nastylines file, any scanner noise on blank lines will be ignored. 
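
In outline, the nasty-lines lookup amounts to a dictionary keyed by checksum. The Python sketch below illustrates that idea under stated assumptions: CRC-32 stands in for the real line checksum, and the function names are hypothetical.

```python
import zlib

def checksum(text: str) -> int:
    """CRC-32 over the Latin-1 bytes of a line (a stand-in line check)."""
    return zlib.crc32(text.encode("latin-1"))

def load_nasty_lines(lines):
    """Index known-good difficult lines by their checksum."""
    return {checksum(line): line for line in lines}

def resolve(scanned_text: str, printed_checksum: int, nasty: dict) -> str:
    """Use the stored line when the printed checksum matches one."""
    return nasty.get(printed_checksum, scanned_text)

# A long rule made of minus signs, plus the blank line from the example above.
rule = "/*" + "-" * 70 + "*/"
nasty = load_nasty_lines(["", rule])
```

The scanned text is never compared against the stored line; only the printed checksum is. That is why noise on a blank line disappears once a blank line is listed.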

The “nastylines” file is re-read every time repair restarts after an edit, so you can add more lines as the program runs. (The 
error-correction patterns should be done this way, too, but that’ll have to wait for the next release.) 

Sortpages 

If, in the course of scanning, the pages have been split up or have gotten out of order, a perl script called sortpages can restore 
them to the proper order. It can merge multiple input files, discard duplicates, and warn about any missing pages it encounters.
This script requires that the pages have been repaired, so that the page headers can be read reliably. The repair program does not 
care about the order it works on pages in; it examines each page independently. Unmunge, however, does need the pages in order. 
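
The merging behavior can be sketched in a few lines of Python. The real sortpages is a perl script that reads page numbers from the repaired page headers. The version below sidesteps header parsing by assuming each page arrives as a hypothetical (page number, text) pair.

```python
def sort_pages(*batches):
    """Merge batches of (page_number, text) pairs into page order.

    Returns the ordered page texts and a list of missing page numbers.
    """
    pages = {}
    for batch in batches:
        for number, text in batch:
            # Keep the first copy of each page and discard duplicates.
            pages.setdefault(number, text)
    ordered = [pages[number] for number in sorted(pages)]
    if pages:
        missing = [n for n in range(min(pages), max(pages) + 1)
                   if n not in pages]
    else:
        missing = []
    return ordered, missing
```

For example, merging one batch holding pages 3 and 1 with another holding pages 1 and 5 yields pages 1, 3 and 5 in order and reports pages 2 and 4 as missing.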

Unmunging

After repair has finished its work, the unmunge program strips out the checksums and, based on the page headers, divides the data 
up among various files. Its first argument is the file to unpack. The optional second argument is a manifest file that lists all of the 
files and the directories they go in. Supplying this (an excellent idea) lets unmunge recreate a directory hierarchy and warn about 
missing files. 

When you have unmunged everything and reconstructed the original source code, you are done. Unmunge verifies all of the 
checksums independently of repair, as a sanity check, and you can have high confidence that the files are exactly the same as the 
originals that were printed. 


4 BOOTSTRAPPING 


There’s a problem using the postprocessing tools to correct OCR errors, when the code being OCRed is the tools themselves. We’ve 
tried to provide a reasonably easy way to get the system up and running starting from nothing but a copy of OmniPage. 

You could just scan all of the tools in, correct any errors by hand, delete the error-checking information in a text editor, and compile 
them. But finding all the errors by hand is painful in a body of code that large. With the aid of perl (version 5), which provides a 
lot of power in very little code, we have provided some utilities to make this process easier. 

The first-stage bootstrap is a one-page perl script designed to be as small and simple as possible, because you'll have to hand-correct 
it. It can verify the checksums on each line, and drop you into the editor on any lines where an error has occurred. It also knows 
how to strip out the visible spaces and tabs, how to correct spacing errors after visible tab characters, and how to invoke an editor 
on the erroneous line. 

Scan in the first-stage bootstrap as carefully as possible, using OmniPage’s warnings to guide you to any errors, and either use a 
text editor or the one-line perl command at the top of the file to remove the checksums and convert any funny printed characters 
to whitespace form. 

The first thing to do is try running it on itself, and correct any errors you find this way. Note that the script writes its output to 
the file named in the page header, so you should name your hand-corrected version differently (or put it in a different directory) to 
avoid having it overwritten. 

The second-stage bootstrap is a much denser one-pager, with better error detection; it can detect missing lines and missing pages, 
and takes an optional second argument of a manifest file which it can use to put files in their proper directories. It’s not strictly 
necessary, but it’s only one more (dense) page and you can check it against itself and the original bootstrap. 

Both of the bootstrap utilities can correct tab spacing errors in the OCR output. Although this doesn’t matter in most source code,
it is included in the checksums. 

Once you have reached this point, you can scan in the C code for repair and unmunge. The C unmunge is actually less friendly than 
the bootstrap utilities, because it is only intended to work with the output of repair. It is, however, much faster, since computing 
CRCs a bit at a time in an interpreted language is painfully slow for large amounts of data. It can also deal with binary files printed 
in radix-64. 
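
To illustrate why bit-at-a-time CRCs are slow in an interpreted language, the Python sketch below computes the common reflected CRC-32 twice: once bit by bit, and once with a 256-entry lookup table that handles a whole byte per step. The polynomial is an assumption; the text above does not say which CRCs the tools use.

```python
import zlib

POLY = 0xEDB88320  # reflected CRC-32 polynomial (assumed for illustration)

def crc32_bitwise(data: bytes) -> int:
    """Eight shift-and-xor steps per input byte."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (POLY if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

def make_table():
    """Precompute the effect of the eight steps for every byte value."""
    table = []
    for value in range(256):
        crc = value
        for _ in range(8):
            crc = (crc >> 1) ^ (POLY if crc & 1 else 0)
        table.append(crc)
    return table

TABLE = make_table()

def crc32_table(data: bytes) -> int:
    """One table lookup per input byte."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = (crc >> 8) ^ TABLE[(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF

# Both agree with the library routine on the standard check string.
agrees = crc32_bitwise(b"123456789") == crc32_table(b"123456789") == zlib.crc32(b"123456789")
```

The bitwise form needs no table, which suits a bootstrap script that must be typed in or corrected by hand. The table form trades a little code for roughly an eightfold reduction in inner-loop steps.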


5 PRINTING 


Despite the title of this book, this process of producing a book is not well documented, since it’s been evolving up to the moment 
of publication. There is, however, a very useful working example of how to produce a book (strikingly similar to this book) in the
example directory, all controlled by a Makefile. 


Briefly, a master perl script called psgen takes three parameters: a file list, a page numbers file to write to, and a volume number
(which should always be 1 for a one-volume book). It runs the listed files through the munge utility, wraps them in some simple 
PostScript, and prepends a prolog that defines the special characters and PostScript functions needed by the text. 

The file list also includes per-file flags. The most important is the text/binary marker. Text files can also have a tab width specified,
although munge knows how to read Emacs-style tab width settings from the end of a source file. 
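
Emacs-style settings of this kind usually sit in a “Local Variables” block near the end of a file, with a line such as tab-width: 4. The Python sketch below shows one way such a setting could be picked up. It is not how munge itself parses the file; the function name, the default of 8, and the size of the tail that is searched are assumptions.

```python
import re

TAB_WIDTH_RE = re.compile(r"tab-width:\s*(\d+)")

def tab_width(source: str, default: int = 8, tail_chars: int = 3000) -> int:
    """Return the tab width named near the end of a file, else a default."""
    tail = source[-tail_chars:]  # only the end of the file is searched
    match = TAB_WIDTH_RE.search(tail)
    return int(match.group(1)) if match else default

example = (
    "int x;\n"
    "/*\n"
    " * Local Variables:\n"
    " * tab-width: 4\n"
    " * End:\n"
    " */\n"
)
```

Knowing the tab width matters at print time because the blanks printed after each tab symbol must reach the same column the author saw.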

The prolog is assembled by psgen from various other files and defines, using a simple preprocessor called yapp (Yet Another
Preprocessor). This process includes some book-specific information like the page footer. 

Producing the final PostScript requires the necessary non-standard fonts (Futura for the footers and OCRB for the code) and the 
psutils package, which provides the includeres utility used to embed the fonts in the PostScript file. The fonts should go in the 
books/ps directory, as “Futura.pfa” and the like. 

The pagenums file can be used to produce a table of contents. For this book, we generated the front matter (such as this chapter) 
separately, told psgen to start on the next page after this, and concatenated the resultant PostScript files for printing. The only 
trick was making the page footers look identical. 


6 NOTES


This PDF has been created by Martin Monperrus in 2020, based on the content of "ocr-tools.zip", discussions with Philip R.
Zimmermann and modifications by Michele Guerini Rocco. 
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This book grew out of a project to publish source code for cryptographic 
software, namely PGP (Pretty Good Privacy), a software package for the 
encryption of electronic mail and computer files. PGP is the most widely 
used software in the world for email encryption. Pretty Good Privacy, Inc 
(or "PGP") has published the source code of PGP for peer review, a long- 
standing tradition in the history of PGP. The first time a fully implemented 
cryptographic software package was published in its entirety in book form 
was "PGP Source Code and Internals," by Philip Zimmermann, published by The 
MIT Press, 1995, ISBN 0-262-24039-4. 


Peer review of the source code is important to get users to trust the 
software, since any weaknesses can be detected by knowledgeable experts who 
make the effort to review the code. But peer review cannot be completely 
effective unless the experts conducting the review can compile and test the 
software, and verify that it is the same as the software products that are 
published electronically. To facilitate that, PGP publishes its source code 
in printed form that can be scanned into a computer via OCR (optical 
character recognition) technology. 


Why not publish the source code in electronic form? As you may know, 
cryptographic software is subject to U.S. export control laws and 
regulations. The new 1997 Commerce Department Export Administration 
Regulations (EAR) explicitly provide that "A printed book or other printed 
material setting forth encryption source code is not itself subject to the 
EAR." (see 15 C.F.R. 734.3(b)(2)). PGP, in an overabundance of caution, 
has only made available its source code in a form that is not subject to 
those regulations. So, books containing cryptographic source code may be 
published, and after they are published they may be exported, but only 
while they are still in printed form. 


Electronic commerce on the Internet cannot fully be successful without 
strong cryptography. Cryptography is important for protecting our privacy, 
civil liberties, and the security of our personal and business transactions 
in the information age. The widespread deployment of strong cryptography 
can help us regain some of the privacy and security that we have lost due 
to information technology. Further, strong cryptography (in the form of 
PGP) has already proven itself to be a valuable tool for the protection of 
human rights in oppressive countries around the world, by keeping those 
governments from reading the communications of human rights workers. 


This book of tools contains no cryptographic software of any kind, nor does 
it call, connect, nor integrate in any way with cryptographic software. But 
it does contain tools that make it easy to publish source code in book form. 
And it makes it easy to scan such source code in with OCR software rapidly 
and accurately. 


Philip Zimmermann 
prz@mit.edu 


November 1997 


# INTRODUCTION 


This book contains tools for printing computer source code on paper in 
human-readable form and reconstructing it exactly using automated tools. 
While standard OCR software can recover most of the graphic characters, 
non-printing characters like tabs, spaces, newlines and form feeds cause 
problems. 
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source code. The two-dimensional indentation structure of source code is 
very important to its comprehensibility. In some cases, distinctions 
between non-printing characters are critical: the standard make utility 
will not accept spaces where it expects to see a tab character. 


Producing a byte-for-byte identical copy of the original is also valuable 
for authentication, as you can verify a checksum. 


There are five problems we have addressed: 


Getting good OCR accuracy. 

Preserving whitespace. 

Preserving lines longer than can be printed on the page. 
Dealing with data that isn't human-readable. 

Detecting and correcting any residual errors. 
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The first problem is partly addressed by using a font designed for OCR 
purposes, OCR-B. OCR-A is a very ugly font that contains only the digits 0 
through 9 and a few special punctuation symbols. OCR-B is a very readable 
monospaced font that contains a full ASCII set, and has been popular as a 
font on line printers for years because it distinguishes ambiguous 
characters and is clear even if fuzzy or distorted. 


The most unusual thing about the OCR-B font is the way that it prints a 
lower-case letter 1, with a small hook on the bottom, something like an 
upper-case L. This is to distinguish it from the numeral 1. We also made 
some modifications to the font, to print the numeral 0 with a slash, and 
to print the vertical bar in a broken form. Both of these are such common 
variants that they should not present any intelligibility barrier. Finally, 
we print the underscore character in a distinct manner that is hopefully 
not visually distracting, but is clearly distinguishable from the minus 
sign even in the absence of a baseline reference. 


The most significant part of getting good OCR accuracy is, however, using 
the OCR tools well. We've done a lot of testing and experimentation and 
present here a lot of information on what works and what doesn't. 


To preserve whitespace, we added some special symbols to display spaces, 
tabs, and form feeds. A space is printed as a small triangular dot 
character, while a hollow rightward-pointing triangle (followed by blank 
spaces to the right tab stop) signifies a tab. A form feed is printed as 
a yen symbol, and the printed line is broken after the form feed. 


Making the dot triangular instead of square helps distinguish it from a 
period. To reduce the clutter on the page and make the text more readable, 
the space character is only printed as a small dot if it follows a blank 
on the page (a tab or another space), or comes immediately before the end 
of the line. Thus, the reader (human or software) must be able to 
distinguish one space from no spaces, but can find multiple spaces by 
counting the dots (and adding one). 


The format is designed so that 80 characters, plus checksums, can be 
printed on one line of an 8.5x11" (or A4) page, the still-common punched 
card line length. Longer lines are managed with the simple technique of 
appending a big ugly black blob to the first part of the line indicating 
that the next printed line should be concatenated with the current one 
with no intervening newline. Hopefully, its use is infrequent. 


While ASCII text is by far the most popular form, some source code is not 
readable in the usual way. It may be an audio clip, a graphic image bitmap, 
or something else that is manipulated with a specialized editing tool. For 
printing purposes, these tools just print any such files as a long string
of gibberish in a 64-character set designed to be easy to OCR unambiguously.
Although the tools recognize such binary data and apply extra consistency
checks, that can be considered a separate step.


Finally, the problem of residual errors arises. OCR software is not perfect,
and uses a variety of heuristics and spelling-check dictionaries to clean up
any residual errors in human-language text. This isn't reliable enough for
source code, so we have added per-page and per-line checksums to the printed
material, and a series of tools to use those checksums to correct any 
remaining errors and convert the scanned text into a series of files again. 
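
A per-line check can be sketched as below. The CRC constants (initial value 0, polynomial 0x8408, bits shifted right) follow the first-stage bootstrap script printed later in this book; the line layout assumed here is two prefix characters, four hex digits of CRC, one space, then the text. The function names are hypothetical.

```python
def crc16(data):
    """Bit-at-a-time CRC-16, shifting right with polynomial 0x8408."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x8408 if crc & 1 else 0)
    return crc


def check_line(printed):
    """Compare the printed CRC with one computed over the line's text."""
    seen = int(printed[2:6], 16)          # hex digits after the 2-char prefix
    text = printed[7:].rstrip() + "\n"    # trailing blanks stripped, newline added
    return crc16(text.encode("latin-1")) == seen


print(check_line("caefe6 }"), check_line("caefe6 ]"))
```

The sample line is one that appears in the listings, a lone closing brace with checksum `efe6`; changing the brace to a bracket makes the check fail.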


This "munged" form is what you see in most of the body of this book. We 
think it does a good job of presenting source code in a way that can be read 
easily by both humans and computers. 


The tools are command-line oriented and a bit clunky. This has a purpose 
beyond laziness on the authors' parts: it keeps them small. Keeping them 
small makes the "bootstrapping" part of scanning this book easier, since you 
don't have the tools to help you with that. 


# SCANNING 


Our tests were done with OmniPage 7.0 on a Power Macintosh 8500/120 and an 
HP ScanJet 4c scanner with an automatic document feeder. The first part of 
this is heavily OmniPage-specific, as that appears to be the most widely 
available OCR software. 


The tools here were developed under Linux, and should be generally portable 
to any Unix platform. Since this book is about printing and scanning source 
code, we assume the readers have enough programming background to know how 
to build a program from a Makefile, understand the hazards of CR, LF or CRLF 
line endings, and such minor details without explicit mention. 


The first step to getting OmniPage 7 to work well is to set it up with
options to disable all of its more advanced features for preserving font
changes and formatting. Look in the Settings menu.


* Create a Zone Contents File with all of ASCII in it, plus the extra
  bullet, currency, yen and pilcrow symbols. Name it "Source Code".

* Create a Source Code style set. Within it, create a Source Code zone style
  and make it the default.

  + Set the font to something fixed-width, like Courier.
  + Set a fixed font size (10 point) and plain text, left-aligned.
  + Set the tab character to a space.
  + Set the text flow to hard line returns.
  + Set the margins to their widest.
  + The font mapping options are irrelevant.


Go to the settings panel and:


* Under Scanner, set the brightness to manual. With careful setting of the
  threshold, this generates much better results than either the automatic
  threshold or the 3D OCR. Around 144 has been a good setting for us; you
  may want to start there.

* Under OCR, you'll build a training file to use later, but turn off
  automatic page orientation and select your Source Code style set in the
  Output Options. Also set a reasonable reject character. (For testing, we
  used the pi symbol, which came across from the Macintosh as a weird
  sequence, but you can use anything as long as you make the appropriate
  definition in subst.c.)


Do an initial scan of a few pages and create a manual zone encompassing 
all of the text. Leave some margin for page misalignment, and leave space
on the sides for the left-right shift caused by the book binding being in
different places on odd and even pages.


Set the Zone Contents and the Style set to the Source Code settings. After
setting the Style Set, the Zone Style should be automatically set correctly 
(since you set Source Code as the default). 


Then save the Zone Template, and in the pop-up menu under the Zone step on 
the main toolbar you can now select it. 


Now we're ready to get characters recognized. The first results will be 
terrible, with lots of red (unrecognizable) and green (suspicious) text in 
the recognized window. Some tweaking will improve this enormously. 


The first step is setting a good black threshold. Auto brightness sets the 
threshold too low, making the character outlines bleed and picking up a lot 
of glitches on mostly-blank pages. Try training OCR on the few pages you've 
scanned and look at the representative characters. Adjust the threshold so 
the strokes are clear and distinct, neither so thin they are broken nor so 
thick they smear into each other. The character that bleeds worst is
lowercase w, while the underscore and tab symbols have the thinnest lines 
that need worry. 


You'll have to re-scan (you can just click the AUTO button) until you get 
satisfactory results. 


The next step is training. You should scan a significant number of pages 
and teach OmniPage about any characters it has difficulty with. There are 
several characters which have been printed in unusual ways which you must 
teach OmniPage about before it can recognize them reliably. We also have 
some characters that are unique, which the tools expect to be mapped to 
specific Latin-1 characters to be processed. 


The characters most in need of training are as follows:


* Zero is printed "slashed."

* Lowercase L has a curled tail to distinguish it clearly from other
  vertical characters like 1 and I.

* The or-bar or pipe symbol '|' is printed "broken" with a gap in the
  middle to distinguish it similarly.

* The underscore character has little "serifs" on the end to distinguish
  it from a minus sign. We also raised it just a tad higher than the
  normal underscore character, which was too low in the character cell to
  be reliably seen by OmniPage.

* Tabs are printed as a hollow right-pointing triangle, followed by blanks
  to the correct alignment position. If not trained enough, OmniPage
  guesses this is a capital D. You should train OmniPage to recognize this
  symbol as a currency symbol (Latin-1, octal 244).

* Any spaces in the original that follow a space, or a blank on the printed
  page, are printed as a tiny black triangle. You should train OmniPage to
  recognize this as a center dot or bullet (Latin-1, octal 267). We didn't
  use a standard center dot because OmniPage confused it with a period.

* Any form feeds in the original are printed as a yen currency symbol
  (Latin-1, octal 245).

* Lines over 80 columns long are broken after 79 columns by appending a big
  ugly black block. You should train OmniPage to recognize this as a
  pilcrow (paragraph symbol; Latin-1, octal 266). We did this because after
  deciding something black and visible was suitable, we found out the font
  we used doesn't have a pilcrow in it.


The zero and the tab character, because of their frequency, deserve special 
attention. 


In addition, look for any unrecognized characters (in red) and retrain those 
pages. If you get an unrecognized character, that character needs training,
but Caere says that "good examples" are best to train on, so if the training
doesn't recognize a slightly fuzzy K, and there's a nice crisp K available
to train on, use that.


Other things that need training:


* ~ (tilde), ^ (caret), ` (backquote) and ' (quote). These get dropped
  frequently unless you train them.

* i, j and ; (semicolon). These get mixed up.

* 3 and S. These also get mixed up.

* Q can fail to be recognized.

* C and [ can be confused.

* c/C, o/O, p/P, s/S, u/U, v/V, w/W, y/Y and z/Z are often confused. This
  can be helped by some training.

* r gets confused with c and n. I don't understand c, but it happens.

* f gets confused with i.


The OCR training pages have lots of useful examples of troublesome 
characters. Scan a few pages of material, training each page, then scan a 
few dozen pages and look for recognition problems. Look for what OmniPage 
reports as troublesome, and when you have the repair program working, use 
it to find and report further errors. Train a few pages particularly dense 
in problems and append the troublesome characters to the training file, then
re-recognize the lot.


Double-check your training file for case errors. It's easy to miss the shift
key in the middle of a lot of training, and doing so produces terrible
results even though OmniPage won't report anything amiss. We have spent a while
wondering why OmniPage wasn't recognizing capital S or capital W, only to 
find that OmniPage was just doing what it was trained to do. 


We have heard some reports that OmniPage has problems with large training 
files. We have observed OmniPage suffering repeatable internal errors 
sometimes after massive training additions, but they were cured by deleting 
a few training images. Appending more training images to the training file 
did not cause the problem to re-appear. 


Repairing the OCR results 


If the only copy of the tools you have is printed in this book, see the next 
chapter on bootstrapping at this point. Here, we assume that you have the 
tools and they work. 


When you have some reasonable OCR results, delete any directory pages. With 
no checksum information, they just confuse the postprocessing tools. (The 
tools will just stop with an error when they get to the "uncorrectable" 
directory name and you'll have to delete it then, so it's not fatal if you 
forget.) Copy the data to a machine that you have the repair and unmunge 
utilities on. 


The repair utility attempts automatic table-driven correction of common 
scanning errors. You have to recompile it to change the tables, but are 
encouraged to if you find a common problem that it does not correct reliably. 
If it gets stuck, it will deposit you into your favorite editor on or 
slightly after the offending line. (The file you will be editing is the 
unprocessed portion of the input.) After you correct the problem and quit 
the editor, repair will resume. 


"Your favorite editor" is taken from the $VISUAL and $EDITOR environment 
variables, or the -e option to repair. 
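
A sketch of the editor lookup follows. The order of precedence is an assumption: here the -e option wins, then $VISUAL, then $EDITOR, with "vi" as the last resort (the fallback used by the bootstrap scripts). The function names are hypothetical.

```python
import os


def pick_editor(option=None, environ=None):
    """Choose an editor: -e option, then $VISUAL, then $EDITOR, then vi."""
    env = os.environ if environ is None else environ
    return option or env.get("VISUAL") or env.get("EDITOR") or "vi"


def editor_command(editor, line, path):
    """Build the argument list that opens `path` at a given line."""
    return [editor, "+%d" % line, path]


print(editor_command(pick_editor(environ={"EDITOR": "ed"}), 42, "scan.in"))
```

The "+N" argument is the convention the bootstrap scripts use to place the editor on the offending line.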


The repair utility never alters the original input file. It will produce 
corrected output for file in file.out, and when it has to stop, it writes 
any remaining uncorrected input back out to file.in (via a temporary
file.dump) and lets you edit this file. If you re-run repair on file and
file.in exists, repair will restart from there, so you may safely quit and
re-run repair as often as you like. (But if you change the input file, you
need to delete the .in file for repair to notice the change.)


Statistics on repair's work are printed to file.log. This is an excellent
place to look to see if any characters require more training.


As it works, repair prints the line it is working on. If you see it make a 
mistake or get stuck, you can interrupt it (control-C or whatever is 
appropriate), and it will immediately drop into the editor. If you interrupt 
it a second time, it will exit rather than invoking the editor. If the 
editor returns a non-zero result code (fails), repair will also stop. (E.g.
:cq in vim.)


One thing that repair fixes without the least trouble is the number of 
spaces expected after a printing tab character. It's such an omnipresent OCR 
software error that repair doesn't even log it as a correction. 


In some cases, repair can miscorrect a line and go on to the next line, 
possibly even more than once, finally giving up a few lines below the actual 
error. If you are having trouble spotting the error, one helpful trick is to 
exit the editor and let repair try to fix the page again, but interrupt it 
while it is still working on the first line, before it has found the 
miscorrection. 


The Nasty Lines 


Some lines of code, particularly those containing long runs of underscore or
minus characters, are especially difficult to scan reliably. The repair
program has a special "nasty lines" feature to deal with this. If a file
named "nastylines" (or as specified by the -1 option) exists, its lines are
checksummed and are treated as total replacements for any input line with
the same checksum. So, for example, if you place a blank line in the
nastylines file, any scanner noise on blank lines will be ignored.
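
The idea can be sketched as a lookup table keyed by checksum. This is an illustration only, not the repair program's code: the function names are hypothetical, and the CRC is the per-line CRC-16 used by the bootstrap scripts.

```python
def crc16(data):
    """Per-line CRC-16 (initial value 0, polynomial 0x8408, shifted right)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x8408 if crc & 1 else 0)
    return crc


def load_nasty_lines(lines):
    """Map each known-good line's checksum to the line itself."""
    table = {}
    for line in lines:
        text = line.rstrip()
        table[crc16((text + "\n").encode("latin-1"))] = text
    return table


def substitute(printed_crc_hex, scanned_text, table):
    """Use the known-good line when the printed checksum matches one."""
    return table.get(int(printed_crc_hex, 16), scanned_text)


table = load_nasty_lines(["", "_" * 40])
print(repr(substitute("af5a", " . ,", table)))
```

With a blank line in the table, specks scanned on a line whose printed checksum is `af5a` (the checksum of an empty line) are simply replaced by the empty line.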


The "nastylines" file is re-read every time repair restarts after an edit, 
so you can add more lines as the program runs. (The error-correction patterns 
should be done this way, too, but that'll have to wait for the next release.) 


Sortpages 


If, in the course of scanning, the pages have been split up or have gotten 
out of order, a perl script called sortpages can restore them to the proper 
order. It can merge multiple input files, discard duplicates, and warn about
any missing pages it encounters. This script requires that the pages have 
been repaired, so that the page headers can be read reliably. The repair 
program does not care about the order it works on pages in; it examines each 
page independently. Unmunge, however, does need the pages in order. 
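
The following sketch shows the kind of processing involved; it is not the sortpages script. It assumes each page begins with a header line of the form "--" plus four CRC digits, a 19-character code whose last four hex digits are the file number, and "Page N of NAME", as the bootstrap scripts expect. All function names are hypothetical.

```python
import re

HEADER = re.compile(r"^--\S{4} (\S{19}) Page (\d+) of (.*)$")


def split_pages(lines):
    """Group lines into pages; a header line starts each page."""
    pages = []
    for line in lines:
        m = HEADER.match(line)
        if m:
            key = (int(m.group(1)[15:], 16), int(m.group(2)))
            pages.append((key, [line]))
        elif pages:
            pages[-1][1].append(line)
    return pages


def sort_pages(pages):
    """Return (ordered lines, warnings) for de-duplicated, sorted pages."""
    unique = {}
    for key, body in pages:
        unique.setdefault(key, body)       # keep the first copy of a page
    ordered, warnings, last_page = [], [], {}
    for file_num, page_num in sorted(unique):
        if page_num != last_page.get(file_num, 0) + 1:
            warnings.append("file %d: page(s) missing before page %d"
                            % (file_num, page_num))
        last_page[file_num] = page_num
        ordered.extend(unique[(file_num, page_num)])
    return ordered, warnings
```

Sorting on the (file number, page number) pair is what lets several scanned files be merged in any order.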


Unmunging 


After repair has finished its work, the unmunge program strips out the 
checksums and, based on the page headers, divides the data up among various 
files. Its first argument is the file to unpack. The optional second argument 
is a manifest file that lists all of the files and the directories they go 
in. Supplying this (an excellent idea) lets unmunge recreate a directory 
hierarchy and warn about missing files. 
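
A manifest reader might look like the sketch below. It assumes the format read by the second-stage bootstrap script: a line "D directory" sets the current directory, and a line "number name" assigns a file number to a name within it. The directory and file names in the example are made up.

```python
import re


def read_manifest(lines):
    """Return {file number: path}, applying the most recent D line."""
    index = {}
    directory = ""
    for line in lines:
        line = line.rstrip("\n")
        m = re.match(r"D\s+(.*)$", line)
        if m:
            directory = m.group(1)
        m = re.match(r"(\d+)\s+(.*)$", line)
        if m:
            index[int(m.group(1))] = directory + m.group(2)
    return index


manifest = ["D tools/", "1 bootstrap", "2 bootstrap2", "D tools/ps/", "3 prolog.ps"]
print(read_manifest(manifest))
```

Because every file number maps to a full path, a gap in the numbers seen while unpacking indicates a missing file.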


When you have unmunged everything and reconstructed the original source code, 
you are done. Unmunge verifies all of the checksums independently of repair, 
as a sanity check, and you can have high confidence that the files are 
exactly the same as the originals that were printed. 


# BOOTSTRAPPING


There's a problem using the postprocessing tools to correct OCR errors when
the code being OCRed is the tools themselves. We've tried to provide a
reasonably easy way to get the system up and running starting from nothing 
but a copy of OmniPage. 


You could just scan all of the tools in, correct any errors by hand, delete 
the error-checking information in a text editor, and compile them. But
finding all the errors by hand is painful in a body of code that large.
With the aid of perl (version 5), which provides a lot of power in very
little code, we have provided some utilities to make this process easier.


The first-stage bootstrap is a one-page perl script designed to be as small
and simple as possible, because you'll have to hand-correct it. It can verify 
the checksums on each line, and drop you into the editor on any lines where 
an error has occurred. It also knows how to strip out the visible spaces and 
tabs, how to correct spacing errors after visible tab characters, and how to 
invoke an editor on the erroneous line. 


Scan in the first-stage bootstrap as carefully as possible, using OmniPage's 
warnings to guide you to any errors, and either use a text editor or the 
one-line perl command at the top of the file to remove the checksums and 
convert any funny printed characters to whitespace form. 


The first thing to do is try running it on itself, and correct any errors you 
find this way. Note that the script writes its output to the file named in 
the page header, so you should name your hand-corrected version differently 
(or put it in a different directory) to avoid having it overwritten. 


The second-stage bootstrap is a much denser one-pager, with better error 
detection; it can detect missing lines and missing pages, and takes an 
optional second argument of a manifest file which it can use to put files 
in their proper directories. It's not strictly necessary, but it's only one 
more (dense) page and you can check it against itself and the original 
bootstrap. 
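
The page header carries the fields that make this checking possible. The sketch below splits one header the way the second-stage script does: one hex digit of version, two of flags, eight of page CRC, one of tab width, three of product number and four of file number. The dictionary keys and function names are hypothetical, and the continuity test is simplified.

```python
import re

HEADER = re.compile(
    r"^--(\S{4}) (\S)(\S\S)(\S{8})(\S)(\S{3})(\S{4}) Page (\d+) of (.*)$"
)


def parse_header(line):
    """Split a page header line into its fields, or return None."""
    m = HEADER.match(line.strip())
    if m is None:
        return None
    crc, vers, flags, pcrc, tabw, prod, fnum, page, name = m.groups()
    return {
        "line_crc": int(crc, 16), "version": int(vers, 16),
        "flags": int(flags, 16), "page_crc": int(pcrc, 16),
        "tab_width": int(tabw, 16), "product": int(prod, 16),
        "file_number": int(fnum, 16), "page": int(page), "file": name,
    }


def follows(prev, cur):
    """True if cur is the page expected right after prev."""
    if cur["file_number"] == prev["file_number"]:
        return cur["page"] == prev["page"] + 1
    # A new file must start at page 1, after a page flagged as a last page.
    return (cur["file_number"] == prev["file_number"] + 1
            and cur["page"] == 1 and bool(prev["flags"] & 1))


print(parse_header("--4e5f 000444c09cd40010001 Page 3 of psgen.pl"))
```

The sample header is one that appears in the listings of this book; it decodes to tab width 4, file number 1, page 3.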


Both of the bootstrap utilities can correct tab spacing errors in the OCR
output. Although this doesn't matter in most source code, it is included 
in the checksums. 
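
The correction can be sketched as follows, using the same arithmetic as the TabSkip routine in the bootstrap scripts: a tab symbol printed after a prefix of length n should be followed by tabWidth - 1 - (n mod tabWidth) blanks. A currency sign stands in for the printed tab symbol, the function names are hypothetical, and, like the first-stage script, the sketch only adds missing blanks and never removes extra ones.

```python
import re

TAB = "\u00a4"  # currency sign, the symbol the tab triangle is trained as


def tab_skip(prefix, tab_width):
    """Blanks that should follow a tab symbol printed after `prefix`."""
    return tab_width - 1 - (len(prefix) % tab_width)


def fix_tab_spacing(line, tab_width):
    """Pad the blanks after each tab symbol up to the next tab stop."""
    out, i = "", 0
    while i < len(line):
        if line[i] == TAB:
            j = i + 1
            while j < len(line) and line[j] == " ":
                j += 1
            blanks = max(j - i - 1, tab_skip(out, tab_width))
            out += TAB + " " * blanks
            i = j
        else:
            out += line[i]
            i += 1
    return out


def unmunge_tabs(line, tab_width):
    """Replace each tab symbol and its padding with a real tab."""
    def repl(m):
        extra = len(m.group(1)) - tab_skip(m.string[:m.start()], tab_width)
        return "\t" + " " * extra
    return re.sub(TAB + "( *)", repl, line)


print(repr(unmunge_tabs(fix_tab_spacing("ab" + TAB + "x", 4), 4)))
```

Blanks beyond the padding are kept as ordinary spaces, since they may be real spaces that followed the tab in the original.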


Once you have reached this point, you can scan in the C code for repair and 
unmunge. The C unmunge is actually less friendly than the bootstrap 
utilities, because it is only intended to work with the output of repair. 
It is, however, much faster, since computing CRCs a bit at a time in an 
interpreted language is painfully slow for large amounts of data. It can 
also deal with binary files printed in radix-64. 
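
To illustrate why a compiled, byte-at-a-time CRC is faster, here is the standard table-driven form of the same CRC-16 next to the bit-at-a-time loop used by the bootstrap scripts. This is the textbook technique, not necessarily how the C tools are written; the function names are hypothetical.

```python
def crc16_bitwise(data):
    """Bit-at-a-time CRC-16, as in the bootstrap scripts."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x8408 if crc & 1 else 0)
    return crc


def make_table():
    """Precompute the effect of the eight shifts for every byte value."""
    table = []
    for value in range(256):
        crc = value
        for _ in range(8):
            crc = (crc >> 1) ^ (0x8408 if crc & 1 else 0)
        table.append(crc)
    return table


TABLE = make_table()


def crc16_table(data):
    """Same CRC, one table lookup per byte instead of eight shifts."""
    crc = 0
    for byte in data:
        crc = (crc >> 8) ^ TABLE[(crc ^ byte) & 0xFF]
    return crc


sample = b"print OUT;\n"
print(crc16_bitwise(sample) == crc16_table(sample))
```

The table costs 256 entries of setup and then replaces the inner loop entirely.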


# PRINTING 


Despite the title of this book, the process of producing a book is not well
documented, since it's been evolving up to the moment of publication. There
is, however, a very useful working example of how to produce a book
(strikingly similar to this book) in the example directory, all controlled
by a Makefile.


Briefly, a master perl script called psgen takes three parameters: a file 
list, a page numbers file to write to, and a volume number (which should 
always be 1 for a one-volume book). It runs the listed files through the 
munge utility, wraps them in some simple PostScript, and prepends a prolog 
that defines the special characters and PostScript functions needed by the 
text. 


The file list also includes per-file flags. The most important is the
text/binary marker. Text files can also have a tab width specified, although
munge knows how to read Emacs-style tab width settings from the end of a
source file.


The prolog is assembled from various other files and definitions by psgen using
a simple preprocessor called yapp (Yet Another Preprocessor). This process 
includes some book-specific information like the page footer. 


Producing the final PostScript requires the necessary non-standard fonts 
(Futura for the footers and OCRB for the code) and the psutils package, 
which provides the includeres utility used to embed the fonts in the 
PostScript file. The fonts should go in the books/ps directory, as 
"Futura.pfa" and the like. 


The pagenums file can be used to produce a table of contents. For this book, 
we generated the front matter (such as this chapter) separately, told psgen 
to start on the next page after this, and concatenated the resultant 
PostScript files for printing. The only trick was making the page footers 
look identical. 

#!/usr/bin/env perl -s
#
# bootstrap -- Simpler version of unmunge for bootstrapping
#
# Unmunge this file using:
#   perl -ne 'if (s/^ *[^-\s]\S{4,6} ?//) { s/[\244\245\267]/ /g; print; }'
#
# $Id: bootstrap,v 1.15 1997/11/14 03:52:53 mhw Exp $

sub Fatal   { print STDERR @_;  exit(1); }
sub Max     { my ($a, $b) = @_;  ($a > $b) ? $a : $b; }
sub TabSkip { $tabWidth - 1 - (length($_[0]) % $tabWidth); }

($tab, $yen, $pilc, $cdot, $tmp1, $tmp2)=("\244", "\245", "\266", "\267", "\377", "\376");
$editor = $ENV{'VISUAL'} || $ENV{'EDITOR'} || 'vi';
$inFile = $ARGV[0];
doFile: {
    open(IN, "<$inFile") || die;
    for ($lineNum = 1; ($_ = <IN>); $lineNum++) {
        s/^\s+//;  s/\s+$//;            # Strip leading and trailing spaces
        next if (/^$/);                 # Ignore blank lines
        ($prefix, $seenCRCStr, $dummy, $_) = /^(\S{2})(\S{4})( (.*))?/;

        # Correct the number of spaces after each tab
        while (s/$tab( *)/$tmp1 . ($tmp2 x &Max(length($1), &TabSkip($`)))/e) {}
        s/ ( +)/" " . ($cdot x length($1))/eg;  # Correct center dots
        s/$tmp1/$tab/g;  s/$tmp2/ /g;   # Restore tabs and spaces from correction
        s/\s*$/\n/;                     # Strip trailing spaces, and add a newline

        $crc = $seenCRC = 0;            # Calculate CRC
        for ($data = $_; $data ne ""; $data = substr($data, 1)) {
            $crc ^= ord($data);
            for (1..8) {
                $crc = ($crc >> 1) ^ (($crc & 1) ? 0x8408 : 0);
            }
        }
        if ($crc != hex($seenCRCStr)) { # CRC mismatch
            close(IN);  close(OUT);
            unlink(@filesCreated);
            @filesCreated = ();
            @oldStat = stat($inFile);
            system($editor, "+$lineNum", $inFile);
            @newStat = stat($inFile);
            redo doFile if ($oldStat[9] != $newStat[9]);  # Check mod date
            &Fatal("Line $lineNum invalid: $_");
        }
        if ($prefix eq '--') {          # Process header line
            ($code, $pageNum, $file) = /^(\S{19}) Page (\d+) of (.*)/;
            $tabWidth = hex(substr($code, 11, 1));
            if ($file ne $lastFile) {
                print "$file\n";
                &Fatal("$file: already exists\n") if (!$f && (-e $file));
                close(OUT);
                open(OUT, ">$file") || &Fatal("$file: $!\n");
                push(@filesCreated, ($lastFile = $file));
            }
        } else {                        # Unmunge normal line
            s/$tab( *)/"\t".(" " x (length($1) - &TabSkip($`)))/eg;
            s/$yen\n/\f/;               # Handle form feeds
            s/$pilc\n//;                # Handle continuation lines
            s/$cdot/ /g;                # Center dots -> spaces
            print OUT;
        }
    }
    close(IN);  close(OUT);
}



#!/usr/bin/env perl -s
#
# bootstrap2 -- Second stage bootstrapper, a version of unmunge
#
# $Id: bootstrap2,v 1.4 1997/11/14 03:52:54 mhw Exp $

sub Cleanup { close(IN);  close(OUT);  unlink(@files);  @files = (); }
sub Fatal   { &Cleanup();  print STDERR @_;  exit(1); }
sub TabSkip { $tabWidth - 1 - (length($_[0]) % $tabWidth); }
sub TabFix  { my ($needed, $actual) = (&TabSkip($_[0]), length($_[1]));
    $tmp1 . ($tmp2 x $needed) . (" " x ($actual - $needed)); }
sub HumanEdit { my ($file, $line, @message) = ($inFile, @_);  &Cleanup();
    @old = stat($file);  system($editor, "+$line", $file);  @new = stat($file);
    redo doFile if ($old[9] != $new[9]);    # Check mod date
    &Fatal("Line $line, ", @message); }

($tab, $yen, $pilc, $cdot, $tmp1, $tmp2)=("\244", "\245", "\266", "\267", "\377", "\376");
$editor = $ENV{'VISUAL'} || $ENV{'EDITOR'} || "vi";
($inFile, $manifest, @rest) = @ARGV;
if ($manifest ne "") {                  # Read manifest file
    open(MANIFEST, "<$manifest") || &Fatal("$manifest: $!\n");
    while (<MANIFEST>) { $dir = $1 if /^D\s+(.*)$/;
        $index[$1] = $dir . $2 if /^(\d+)\s+(.*)$/; }
}
doFile: {
    $seenPCRC = $pcrc1 = 0;  $lastFlags = 1;  $lastFileNum = 0;
    open(IN, "<$inFile") || &Fatal("$inFile: $!\n");
    for ($line = 1; ($_ = <IN>); $line++) {
        s/^\s+//;  s/\s+$//;            # Strip leading and trailing spaces
        next if (/^$/);                 # Ignore blank lines
        ($prefix, $seenCRCStr, $dummy, $_) = /^(\S{2})(\S{4})( (.*))?/;
        while (s/$tab( *)/&TabFix($`, $1)/eo) {}  # Correct spaces after tabs
        s/($tmp2| )( +)/$1 . ($cdot x length($2))/ego;  # Correct center dots
        s/$tmp1/$tab/go;  s/$tmp2/ /go; # Restore tabs/spaces from correction
        s/\s*$/\n/;                     # Strip trailing spaces, and add a newline

        $crc = 0;  $pcrc = $pcrc1;      # Calculate CRCs
        for ($data = $_; $data ne ""; $data = substr($data, 1)) {
            $crc ^= ord($data);  $pcrc1 ^= ord($data);
            for (1..8) { $crc = ($crc >> 1) ^ (($crc & 1) ? 0x8408 : 0);
                $pcrc1 = ($pcrc1 >> 1) ^ (($pcrc1 & 1) ? 0xedb88320 : 0); }
        }
        ($seenPLCRC, $seenCRC) = map { hex($_) } ($prefix, $seenCRCStr);
        &HumanEdit($line, "CRC failed: $_") if $crc != $seenCRC;

        if ($prefix eq '--') {          # Process header line
            &HumanEdit($line - 1, "Page CRC failed") if $pcrc != $seenPCRC;
            ($humanHdr, $pageNum, $file) = /^\S{19} (Page (\d+) of (.*))/;
            ($vers, $flags, $seenPCRC, $tabWidth, $prodNum, $fileNum) =
                map { hex($_) } /^(\S)(\S\S)(\S{8})(\S)(\S{3})(\S{4})/;
            if ($fileNum != $lastFileNum) {
                print STDERR "MISSING files\n" if $fileNum != $lastFileNum + 1;
                &Fatal("Missing pages\n") if $pageNum != 1 || !($lastFlags & 1);
                if ($manifest ne "") {
                    ($_ = $index[$fileNum]) =~ m%([^/]*)$%;
                    &Fatal("Manifest mismatch\n") if ($file ne $1);
                    ($file = $_) =~ s|/|mkdir($`, 0777), "/"|eg;  # mkdir -p
                }
                &Fatal("$file: already exists\n") if (!$f && (-e $file));
                close(OUT);  open(OUT, ">$file") || &Fatal("$file: $!\n");
                push(@files, $file);  print "$fileNum $file\n";
            } else {
                &Fatal("MISSING pages\n") if ($pageNum != $lastPageNum + 1);
            }
            ($lastFlags, $lastFileNum, $lastPageNum) = ($flags, $fileNum, $pageNum);
            $pcrc1 = 0;
        } else {                        # Unmunge normal line
            &HumanEdit($line, "CRC failed: $_") if ($pcrc1 >> 24) != $seenPLCRC;
            s/$tab( *)/"\t". (" " x (length($1) - &TabSkip($`)))/ego;
            s/$yen\n/\f/o;  s/$pilc\n//o;  s/$cdot/ /go;  print OUT;
        }
    }
}

#!/usr/bin/env perl

$fileNum = 0;
while(<>)
{
    /^([VDTB])(\S*)\s*(.*)/ || die("Bad filelist, line $.");
    ($type, $options, $name) = ($1, $2, $3);

    if ($type eq "D")
    {
        $dir = $name;
        print "D $dir\n";
    }
    elsif ($type eq "V")
    {
        # Do nothing
    }
    else
    {
        $fileNum++;
        $tail = $name;
        $tail =~ s|^.*/||;
        die("Bad filelist, line $.") if $name ne $dir . $tail;
        print "$fileNum $tail\n";
    }
}

#
# vi: ai ts=4
# vim: si
#

#!/usr/bin/env perl
#
# psgen -- Postscript generator for code portion of source books
#
# Reads in a list of files/dirs from <filelist>, runs munge on each of
# them, and generates a single postscript file to stdout.  The page numbers
# for each file/dir are put into the file <pagenums>.
#
# usage: psgen [ options... ] <filelist> <pagenums> <volume #> -> foo.ps
#           -l<firstLogicalPage>
#           -p<firstPhysicalPage>
#           -f<font>
#           -D<defs> (passed to yapp)
#           -P<productNumber>
#           -o<mungedOutFile>
#           -e              (auto edit errors)
#
# $Id: psgen,v 1.18 1997/11/13 21:44:16 colin Exp $
#

use File::Basename;

$bookRoot = $ENV{"BOOKROOT"} || ".";
$toolsDir = dirname(__FILE__);
$psDir = "$bookRoot/ps";
$editor = $ENV{"EDITOR"} || "vi";

# Configuration settings - external file names
$mungeProg = "$toolsDir/munge";
$yappProg = "$toolsDir/yapp";
$preambleFile = "$psDir/prolog.ps";
$tempFile = "/tmp/psgen-$$";

# Parse arguments
$firstLogPage = $firstPhysPage = 0;
$productNumber = 1;
$font = "Inconsolata";
$autoEdit = 0;

while ($#ARGV >= 0 && $ARGV[0] =~ /^-/)
{
    $_ = shift @ARGV;
    if (/^--$/)
    {
        last;
    }
    elsif (/^-l(\d+)$/)
    {
        $firstLogPage = $1;
    }
    elsif (/^-p(\d+)$/)
    {
        $firstPhysPage = $1;
    }
    elsif (/^-f(.+)$/)
    {
        $font = $1;
    }
    elsif (/^-D(.*)$/)
    {
        $yappDefs .= " " . $_;
    }
    elsif (/^-P(\d+)$/)
    {
        $productNumber = $1;
    }
    elsif (/^-o(.+)$/)
    {
        $mungedOutFile = $1;
    }
    elsif (/^-e$/)
    {
        $autoEdit = 1;
    }
    else
    {
        &Error("Unrecognized option: '$_'");
    }
}

$fileListFile = shift @ARGV || die "Missing file list argument (arg 1)";
$pageNumFile = shift @ARGV || die "Missing page number file argument (arg 2)";
$volume = shift @ARGV || die "Missing volume number argument (arg 3)";

# Determine initial page numbers
{
    my $nextLogPage = 1;
    my $nextPhysPage = 3;
    my $volNum = 0;         # Which volume's page numbers we're reading

    if ($volume > 1)
    {
        open(OLDPAGENUMS, "<$pageNumFile") || die;
        while (<OLDPAGENUMS>)
        {
            if (/^Volume\s+(\d+)$/)
            {
                $volNum = $1;
            }
            elsif (/^Next:\s+(\d+)\s*$/ && $volNum == $volume - 1)
            {
                $nextLogPage = $1;
            }
        }
        close(OLDPAGENUMS);
    }
    else
    {
        unlink($pageNumFile);
    }
    $firstLogPage = $nextLogPage if ($firstLogPage == 0);
    $firstPhysPage = $nextPhysPage if ($firstPhysPage == 0);
}

# Names of PostScript operators invoked.  These are the interface
# between this file and the $preambleFile.
$oddPageStartPS = "OddPageStart";
$evenPageStartPS = "EvenPageStart";
$oddPageEndPS = "OddPageEnd";
$evenPageEndPS = "EvenPageEnd";
$dirPagePS = "DirPage";
# This is short because it's emitted every line
$linePS = "І";

# Handle an error from munge.
# A result of 0 means to retry, 1 means to exit
sub MungeError
{
    my $result = 1;

    open(FILEH, "<$tempFile") || die;
    while (<FILEH>)
    {
        print STDERR;
        if (/ in (.*) line (\d+)$/)
        {
            my ($fileName, $lineNumber) = ($1, $2);

            if ($autoEdit)
            {
                my @statResult = stat($fileName);
                my $oldMTime = $statResult[9];

                system("'$editor' '+$lineNumber' '$fileName' 1>&2");
                @statResult = stat($fileName);
                $result = ($statResult[9] == $oldMTime);
                last;
            }
        }
    }
    close(FILEH);
    unlink($tempFile) || die "Couldn't unlink $tempFile";
    return $result;
}

sub CopyFileToPS
{
    local $fileName = $_[0];
    local $args = "'-I$psDir' '-Dfont=$font'";
    local $_;

    $args .= $yappDefs;
    open(FILEH, "$yappProg $args '$fileName' |") || die;
    while (<FILEH>)
    {
        print PSOUT $_;
    }
    close(FILEH) || exit(1);
    1;
}

# Wrap a string in parens as required by PostScript, with proper quoting.
sub StringPS
{
    local $str = $_[0];

    $str =~ s/([\\()])/\\$1/g;
    "(" . $str . ")";
}

# Emit a start of page.  The Postscript DSC %%Page: header
# (followed by logical page number, then physical) and
# the top-of-page function (which is passed the page number as a string)
sub PageStartPS
{
    local $pageNum = $_[0];

    "%%Page: " . ($pageNum + $firstLogPage) . " " .
                ($pageNum + $firstPhysPage) . "\n" .
        &StringPS($pageNum + $firstLogPage) .
        ((($pageNum + $firstLogPage) % 2) ? $oddPageStartPS
                                          : $evenPageStartPS) . "\n";
}

sub PageEndPS
{
    local $pageNum = $_[0];
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A((($pageNum + $firstLogPage) % 2) 2 $oddPageEndPS : $evenPageEndPS) . 


} 


# Save the page number to a table-of-contents file
sub SavePageNum
{
    local ($fileName, $pageNum) = @_;

    print PAGENUMS ($pageNum + $firstLogPage), ": $fileName\n";
}

# The main code.

open(PSOUT, ">-") || die;
open(FILELIST, "<$fileListFile") || die;
open(PAGENUMS, ">>$pageNumFile") || die;

if ($mungedOutFile ne "")
{
    open(MUNGEDOUT, ">$mungedOutFile") || die;
}

print PAGENUMS "Volume $volume\n";
&CopyFileToPS($preambleFile);

$fileNumber = 0;


“уп”; 


$pageNum = 0;    # This is 0-based, since it is added to $first{Log,Phys}Page


$enable = 0; 

while (<FILELIST>) 

LINO ым || die "Illegal file list line 5."; 
    local ($fileType, $options, $arg) = ($1, $2, $3);

    if ($fileType eq "V")
    {
        @args = split(/\s+/, $arg);
        if ($enable = ($args[0] == $volume))
        {
            $defaultTabWidth = int($args[1]);
        }
    }
    elsif ($fileType eq "D")
    {
        next unless $enable;    # Do nothing if we're in the wrong volume
        $dirName = $arg;
        &SavePageNum($dirName, $pageNum);
        print PSOUT &PageStartPS($pageNum);
        print PSOUT &StringPS($dirName), $dirPagePS, "\n";
        print PSOUT &PageEndPS($pageNum);
        $pageNum++;
    }
    else
    {
        my $done = 0;

        $fileNumber++;
        $fileName = $arg;
        next unless $enable;    # Do nothing if we're in the wrong volume
        &SavePageNum($fileName, $pageNum);

        $quotedFileName = $fileName;
        $quotedFileName =~ s/'/\\'/g;
        $tabWidth = ($options =~ /(\d)/) ? $1 : $defaultTabWidth;


--b0e1 0004c33086840010001 Page 5 of psgen.pl 


        $args = ($fileType eq "B") ? "-b" : "";
        $args .= " -$tabWidth -p$productNumber -f$fileNumber";
        while (!$done)
        {
            if (open(FILE, "$mungeProg $args '$quotedFileName' 2>$tempFile |"))
            {
                $line = <FILE>;
                print MUNGEDOUT $line;

                while ($line ne "")
                {
                    print PSOUT &PageStartPS($pageNum);

                    while ($line ne "" and $line !~ /^\f/)
                    {
                        chop $line;
                        print PSOUT &StringPS($line), $linePS, "\n";
                        $line = <FILE>;
                        print MUNGEDOUT $line;
                    }
                    $line =~ s/^\f//;

                    print PSOUT &PageEndPS($pageNum);
                    $pageNum++;
                }

                if (close(FILE))
                {
3c850e AAAAA$done 
                }
                else
                {
                    $done = &MungeError();
                }
            }
            else
            {
                $done = &MungeError();
            }
        }
        if ($done == 1)
        {
            die;
        }
    }
}


# Print PostScript DSC trailer with the correct number of pages
print PSOUT "%%Trailer\n%%Pages: ", $pageNum, "\n%%EOF\n";

print PAGENUMS "Pages: ", $pageNum, "\n";
print PAGENUMS "Next: ", ((($pageNum+1) & ~1) + $firstLogPage), "\n";

close(PAGENUMS) || die;
close(FILELIST) || die;
close(PSOUT) || die;


2; 



if ($mungedOutFile ne "")
{
    close(MUNGEDOUT) || die;
}

#
# vi: ai ts=4
# vim: si
#


#!/usr/bin/env perl
#
# $Id: sortpages,v 1.8 1997/12/11 19:20:58 mhw Exp $
#

@fileNameFromNumber = ();
@pagesFound = ();
$theProductNumber = 0;

for $fileIndex (0..$#ARGV)
{
    $fileName = $ARGV[$fileIndex];
    open(FILE, "<$fileName") || die;
    while (!eof(FILE))
    {
        $filePos = tell(FILE);
        $_ = <FILE>;
        if (/^\f?-\S/)
        {
            my ($versionHex, $flagsHex, $pageCRCHex, $tabWidthHex,
                $productNumberHex, $fileNumberHex, $pageNumber, $name)
                 = (/^\f?-\S\S{4}\     # CRC followed by a space
                      ([0-9a-f])       # Format version
                      ([0-9a-f]{2})    # Flags
                      ([0-9a-f]{8})    # Running CRC32
                      ([0-9a-f])       # Tab width (0 means radix64)
                      ([0-9a-f]{3})    # Product number
                      ([0-9a-f]{4})    # File number
                      \ Page\ (\d+)\ of\ (.*)/x);
            my $version = hex($versionHex);
            my $flags = hex($flagsHex);
            my $productNumber = hex($productNumberHex);
            my $fileNumber = hex($fileNumberHex);

            unless ($version == 0 && $productNumber > 0
                    && $fileNumber > 0 && $pageNumber > 0
                    && $name ne "")
            {
                print STDERR "ERROR: Invalid header info ",
                             "at $fileName line $.\n";
                exit(1);
            }
            if (!defined($fileNameFromNumber[$fileNumber]))
            {
                $fileNameFromNumber[$fileNumber] = $name;
            }
            elsif ($fileNameFromNumber[$fileNumber] ne $name)
            {
                print STDERR "ERROR: Mismatched filename ",
                             "at $fileName line $.\n";
                exit(1);
            }

            if (!$theProductNumber)
            {
                $theProductNumber = $productNumber;
            }
            elsif ($theProductNumber != $productNumber)
            {
                print STDERR "ERROR: Different product number ",
                             "at $fileName line $.\n";
                exit(1);
            }

            push @pagesFound, (sprintf "%5d:%4d:%d:%d:%d",
                    $fileNumber, $pageNumber, $flags, $fileIndex, $filePos);
        }
    }
    close(FILE) || die;
}
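
The fixed-width hex fields matched by the header regex above can also be decoded without regular expressions. The sketch below parses only the 19-digit field block (the part after the leading CRC and space), using the field widths given in the regex comments. The `PageHeader` struct and both function names are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* Decoded header fields; widths follow the regex comments above. */
typedef struct PageHeader {
    unsigned version;        /* 1 hex digit */
    unsigned flags;          /* 2 hex digits */
    unsigned long pageCRC;   /* 8 hex digits, running CRC32 */
    unsigned tabWidth;       /* 1 hex digit, 0 means radix64 */
    unsigned productNumber;  /* 3 hex digits */
    unsigned fileNumber;     /* 4 hex digits */
} PageHeader;

/* Read exactly `digits` lowercase hex digits at *p and advance *p. */
static int parse_hex_field(const char **p, int digits, unsigned long *out)
{
    unsigned long v = 0;

    while (digits-- > 0) {
        char c = **p;

        if (c >= '0' && c <= '9')
            v = (v << 4) | (unsigned long)(c - '0');
        else if (c >= 'a' && c <= 'f')
            v = (v << 4) | (unsigned long)(c - 'a' + 10);
        else
            return -1;      /* Also catches a premature end of string */
        (*p)++;
    }
    *out = v;
    return 0;
}

/* Returns 0 on success, -1 if the field block is malformed. */
int parse_page_header(const char *s, PageHeader *h)
{
    static const int widths[6] = { 1, 2, 8, 1, 3, 4 };
    unsigned long v[6];
    int i;

    for (i = 0; i < 6; i++)
        if (parse_hex_field(&s, widths[i], &v[i]) != 0)
            return -1;
    h->version = (unsigned)v[0];
    h->flags = (unsigned)v[1];
    h->pageCRC = v[2];
    h->tabWidth = (unsigned)v[3];
    h->productNumber = (unsigned)v[4];
    h->fileNumber = (unsigned)v[5];
    return 0;
}
```

Applied to the field block `000444c09cd40010001` from the "Page 3 of psgen.pl" header, this yields version 0, flags 0, CRC 444c09cd, tab width 4, product number 1 and file number 1.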


@pagesFound = sort @pagesFound;

$result = 0;
$lastFileNumber = 0;
$lastPageNumber = 0;
$nextFileNumber = 1;
$nextPageNumber = 1;
$fileIndexOpen = -1;

foreach (@pagesFound)
{
    my ($fileNumber, $pageNumber, $flags, $fileIndex, $filePos) = split /:/;

    $fileNumber = int($fileNumber);
    $pageNumber = int($pageNumber);

    if ($fileNumber == $lastFileNumber && $pageNumber == $lastPageNumber)
    {
        print STDERR "DUPLICATE: File $fileNumber, page $pageNumber, skipped\n";
        next;
    }

    if ($nextFileNumber < $fileNumber && $nextPageNumber != 1)
    {
        print STDERR "MISSING: File $nextFileNumber, ",
                     "pages $nextPageNumber - END\n";
        $nextPageNumber = 1;
        $nextFileNumber++;
        $result = 1;
    }
    if ($nextFileNumber < $fileNumber)
    {
        print STDERR "MISSING: Files $nextFileNumber - ",
                     $fileNumber-1, "\n";
        $nextFileNumber = $fileNumber;
        $nextPageNumber = 1;
        $result = 1;
    }
    if ($nextFileNumber != $fileNumber)
    {
        print STDERR "ERROR: Internal error, unexpected fileNumber\n";
        exit(1);
    }

    if ($nextPageNumber < $pageNumber)
    {
        print STDERR "MISSING: File $fileNumber, pages $nextPageNumber - ",
                     $pageNumber-1, "\n";
        $nextPageNumber = $pageNumber;
        $result = 1;
    }
    if ($nextPageNumber != $pageNumber)
    {
        print STDERR "ERROR: Internal error, unexpected pageNumber\n";
        exit(1);
    }

    if ($fileIndexOpen != $fileIndex)
    {
        if ($fileIndexOpen >= 0)
        {
            close(FILE) || die;
            $fileIndexOpen = -1;
        }
        $fileName = $ARGV[$fileIndex];
        open(FILE, "<$fileName") || die;
        $fileIndexOpen = $fileIndex;
    }
    seek(FILE, $filePos, 0) || die($!);

    $_ = <FILE>;
    print;
    while (<FILE>)
    {
        last if /^\f?-\S/;
        print;
    }
    $lastFileNumber = $fileNumber;
    $lastPageNumber = $pageNumber;

    if ($flags & 1)    # Bit 0 of flags indicates last page of file
    {
        $nextFileNumber++;
        $nextPageNumber = 1;
    }
    else
    {
        $nextPageNumber++;
    }
}

if ($nextPageNumber != 1)
{
    print STDERR "MISSING: File $nextFileNumber, ",
                 "pages $nextPageNumber - END\n";
    $nextPageNumber = 1;
    $nextFileNumber++;
    $result = 1;
}

print STDERR "Highest file number encountered: ", $nextFileNumber - 1, "\n";

if ($fileIndexOpen >= 0)
{
    close(FILE) || die;
    $fileIndexOpen = -1;
}

exit($result);

#
# vi: ai ts=4
# vim: si

#

#!/usr/bin/env perl
#
# Yet another preprocessor
#
# $Id: yapp,v 1.5 1997/10/24 07:51:05 mhw Exp $
#

%vars = ('' => '$');
@incPath = ('.');

sub Error
{
    print STDERR $_[0], "\n";
    exit(1);
}

sub VarSubst
{
    my ($varName, $undefOkay) = @_;

    if (defined($vars{$varName}))
    {
        return $vars{$varName};
    }
    elsif (!$undefOkay)
    {
        &Error("Undefined variable '$varName' in $fileName line $.");
    }
}

sub NullFilter
{
    0;
}

sub IfFilter
{
    local $_ = $_[0];

    if (/^##else(\s+.*)?/)
    {
        return 1;
    }
    elsif (/^##endif(\s+.*)?/)
    {
        return 2;
    }
    else
    {
        return 0;
    }
}

sub DoFile
{
    local $fileName = $_[0];
    my $path;
    local *FILE;

    if ($fileName =~ m|^/|)
    {
        $path = $fileName;
    }
    else
    {
        for $dir (@incPath)
        {
            if (-e "$dir/$fileName")
            {
                $path = "$dir/$fileName";
                last;
            }
        }
    }
    if ($path eq "")
    {
        &Error("Can't find '$fileName', from $fileName line $.");
    }

    open(FILE, "<$path") || &Error("Can't open $path: $!");
    &DoOpenFile(*FILE, *NullFilter, 0);
    close(FILE) || die;
    0;
}

sub DoPrepass
{
    local ($_, $skipFlag) = @_;

    return "" if /^###/;
    s/\s*###.*//;                                # Strip comments
    s/\$\{(\w*)\}/&VarSubst($1, $skipFlag)/eg;   # Do variable substitutions
    $_;
}

sub DoOpenFile
{
    local *FILE = $_[0];
    local *filter = $_[1];
    my $skipFlag = $_[2];
    my $result;
    local $_;

    while (<FILE>)
    {
        $_ = &DoPrepass($_, $skipFlag);
        if ($result = &filter($_))
        {
            return $result;
        }
        elsif (/^##(\w*)(\s+(.*))?/)
        {
            my ($cmd, $params) = ($1, $3);

            if ($cmd =~ /^if/)
            {
                my $condition;
                my $ifStartLine = $.;

                if ($cmd eq "if")
                {
                    if ($params =~ /^(\d*)\s*$/)
                    {
                        $condition = int($1);
                    }
                    elsif ($params =~ /^(\d+)\s*([=!]=|[<>]=?)\s*(\d+)\s*$/)
                    {
                        my ($left, $op, $right) = ($1, $2, $3);

                        $condition = eval($left . $op . $right);
                    }
                    elsif ($params =~ /^(\S+)\s*(eq|ne)\s*(\S+)\s*$/)
                    {
                        my ($left, $op, $right) = ($1, $2, $3);

                        $left =~ s/([\\'])/\\$1/g;
                        $right =~ s/([\\'])/\\$1/g;
                        $condition = eval("'$left' $op '$right'");
                    }
                    else
                    {
                        &Error("Invalid ##if params: '$params' " .
                               "in $fileName line $.");
                    }
                }
                elsif ($cmd =~ /^ifn?def$/)
                {
                    if ($params =~ /^(\w*)\s*$/)
                    {
                        $condition = defined($vars{$1});
                        $condition = !$condition if ($cmd eq "ifndef");
                    }
                    else
                    {
                        &Error("Invalid ##$cmd param: '$params' " .
                               "in $fileName line $.");
                    }
                }

                # Do main body of if
                $result = &DoOpenFile(*FILE, *IfFilter,
                                      $skipFlag || !$condition);

                if ($result == 1)    # an '##else' was found
                {
                    # Handle else
                    $result = &DoOpenFile(*FILE, *IfFilter,
                                          $skipFlag || $condition);
                }

                if ($result == 1)    # a second '##else' was found
                {
                    &Error("Two ##else's in a row in $fileName line $.");
                }
                elsif ($result == 0)    # EOF was encountered
                {
                    &Error("Unterminated ##if " .
                           "in $fileName line $ifStartLine");
                }
            }
            elsif ($cmd eq "include")
            {
                if ($skipFlag)
                {
                }
                elsif ($params =~ /^"(.*)"\s*$/)
                {
                    my $incFile = $1;

                    &DoFile($incFile);
                }
                else
                {
                    &Error("Invalid ##include params: '$params'");
                }
            }
            elsif ($cmd eq "set")
            {
                if ($params =~ /^(\w+)=<<(")(.*)"\s*$/ or
                    $params =~ /^(\w+)=<<(')(.*)'\s*$/)
                {
                    my $varName = $1;
                    my $quoteChar = $2;
                    my $endTag = $3 . "\n";
                    my $value;

                    while (<FILE>)
                    {
                        if ($_ eq $endTag)
                        {
                            chop $value;
                            last;
                        }
                        else
                        {
                            if ($quoteChar eq '"')
                            {
                                $_ = &DoPrepass($_, $skipFlag);
                            }
                            $value .= $_;
                        }
                    }
                    if (!$skipFlag)
                    {
                        $vars{$varName} = $value;
                    }
                }
                elsif ($params =~ /^(\w+)="(.*)"\s*$/ or
                       $params =~ /^(\w+)=(\S*)\s*$/)
                {
                    if (!$skipFlag)
                    {
                        $vars{$1} = $2;
                    }
                }
                else
                {
                    &Error("Invalid ##set command: '$params'");
                }
            }
            else
            {
AAAA&Error("Unrecognized command: 
            }
        }
        elsif (!$skipFlag)
        {
            print;
        }
    }
    return 0;
}

$optEnable = 1;

foreach (@ARGV)
{
    if ($optEnable and /^-/)
    {
        if (/^--$/)


--68bc 001c6abe38340010001 Page 5 of yapp.pl 


        {
            $optEnable = 0;
        }
        elsif (/^-D(\w*)=(.*)$/)
        {
            $vars{$1} = $2;
        }
        elsif (/^-I(.*)$/)
        {
            unshift @incPath, $1;
        }
        else
        {
            &Error("Unrecognized option: '$_'");
        }
    }
    else
    {
        &DoFile($_);
    }
}

#
# vi: ai ts=4
# vim: si

#

/*
 * subst.h -- Header for repair substitutions
 *
 * Copyright (C) 1997 Pretty Good Privacy, Inc.
 *
 * Written by Colin Plumb
 *
 * $Id: subst.h,v 1.9 1997/11/03 22:12:00 colin Exp $
 */

/*
 * Give up if the list of pending changes to attempt grows to this many
 * elements.  Each element is 32 bytes, so 128K is 4 MB of memory.
 * (Other than this, repair's memory usage is fairly modest.)
 */
#define MAX_HEAP (1<<17)

/*
 * There is a hack in the code to find a single substitution that will fix a
 * line, even if it's not in the tables.  It gets added to the tables "on
 * probation", with an infinite cost, and if it leads to a successful
 * correction of the entire page, is "learned" for future use and its
 * cost reduced to something finite.
 * (This is not remembered across runs of the program, though.  Edit
 * the tables in the source to fix it.)
 */
#define DYNAMIC_COST_LEARNED 15

/*
 * This negative-cost bonus for passing the end of a line with the right
 * CRC makes the search engine reluctant to backtrack past a correct CRC,
 * greatly improving efficiency.  It's rather a hack, though.  Think of
 * this in terms of "how many errors should be considered in the current
 * line before considering the possibility of errors in the previous line?"
 *
 * This bonus is halved for lines that are the result of a correction
 * that was computed from the checksum, since a correct checksum is
 * much less significant in such a case.
 */
#define COST_LINE -30

/* The cost of a full-line nastyline substitution. */
#define NASTY_COST 5

/* Type describing filter functions used in substitutions */
struct ParseNode;
struct Substitution;
#include "heap.h"
typedef HeapCost FilterFunc(struct ParseNode *parent, char const *limit,
    struct Substitution const *subst);
FilterFunc TabFilter,              FilterFollowsSpace, FilterNearBlanks;
FilterFunc FilterNearUpper,        FilterNearLower,    FilterNearXDigit;
FilterFunc FilterAfterRepeat,      FilterCharConst,    FilterChecksumFollows;
FilterFunc FilterLikelyUnderscore, FilterIsDynamic,    FilterIsBinary;

/* The external substitution format */
typedef struct RawSubst {
    char const *input;
    char const *output;
    HeapCost cost, cost2;
    FilterFunc *filter;
} RawSubst;

/* The substitutions to make */
extern struct RawSubst const substSingles[];
extern struct RawSubst const substMultiples[];


--3dd0 00197b9d49440010001 Page 1 of heap.h

/*
 * heap.h -- Simple priority queue.  Takes pointers to cost values
 * (presumably the first field in a larger structure) and returns
 * them in increasing order of cost.
 *
 * Copyright (C) 1997 Pretty Good Privacy, Inc.
 *
 * Written by Colin Plumb and Mark H. Weaver
 *
 * $Id: heap.h,v 1.6 1997/10/31 04:22:46 mhw Exp $
 */

#ifndef HEAP_H
#define HEAP_H 1

#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

typedef int HeapCost;
#define COST_INFINITY INT_MAX
typedef unsigned HeapIndex;

typedef struct Heap {
    HeapCost    **elems;
    HeapIndex   numElems, elemsAllocated;
} Heap;

void HeapInit(Heap *heap, HeapIndex initSize);
void HeapDestroy(Heap *heap);
void HeapInsert(Heap *heap, HeapCost *newElem);
HeapCost *HeapGetMin(Heap *heap);
void HeapVerify(Heap *heap);

#endif

/*
 * Local Variables:
 * tab-width: 4
 * End:
 * vi: ts=4 sw=4
 * vim: si
 */
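
heap.h only declares the interface, so the following is a minimal sketch of how a priority queue over pointers to cost fields could work, using an ordinary array-backed binary min-heap. It is not the implementation that ships with the tools. The `Sketch*` types and `sketch_heap_*` functions are hypothetical names, and the doubling growth policy is an assumption.

```c
#include <stdlib.h>

typedef int SketchCost;

typedef struct SketchHeap {
    SketchCost **elems;             /* Pointers to the cost fields */
    unsigned numElems, elemsAllocated;
} SketchHeap;

/* Returns 0 on success, -1 if memory could not be allocated. */
int sketch_heap_init(SketchHeap *heap, unsigned initSize)
{
    if (initSize == 0)
        initSize = 1;
    heap->elems = malloc(initSize * sizeof(*heap->elems));
    heap->numElems = 0;
    heap->elemsAllocated = heap->elems ? initSize : 0;
    return heap->elems ? 0 : -1;
}

void sketch_heap_destroy(SketchHeap *heap)
{
    free(heap->elems);
    heap->elems = NULL;
    heap->numElems = heap->elemsAllocated = 0;
}

/* Returns 0 on success, -1 if the array could not be grown. */
int sketch_heap_insert(SketchHeap *heap, SketchCost *newElem)
{
    unsigned i;

    if (heap->numElems == heap->elemsAllocated) {
        unsigned newSize = heap->elemsAllocated ? heap->elemsAllocated * 2 : 1;
        SketchCost **p = realloc(heap->elems, newSize * sizeof(*p));

        if (!p)
            return -1;
        heap->elems = p;
        heap->elemsAllocated = newSize;
    }
    /* Sift up: move costlier parents down until newElem fits */
    i = heap->numElems++;
    while (i > 0 && *heap->elems[(i - 1) / 2] > *newElem) {
        heap->elems[i] = heap->elems[(i - 1) / 2];
        i = (i - 1) / 2;
    }
    heap->elems[i] = newElem;
    return 0;
}

/* Removes and returns the cheapest element, or NULL if the heap is empty. */
SketchCost *sketch_heap_get_min(SketchHeap *heap)
{
    SketchCost *min, *last;
    unsigned i, child;

    if (heap->numElems == 0)
        return NULL;
    min = heap->elems[0];
    last = heap->elems[--heap->numElems];
    /* Sift down: fill the hole at the root with the cheaper child */
    for (i = 0; (child = 2 * i + 1) < heap->numElems; i = child) {
        if (child + 1 < heap->numElems &&
            *heap->elems[child + 1] < *heap->elems[child])
            child++;
        if (*last <= *heap->elems[child])
            break;
        heap->elems[i] = heap->elems[child];
    }
    heap->elems[i] = last;
    return min;
}
```

Because the cost is the first field of the caller's structure, the pointer returned by `sketch_heap_get_min` can be cast back to the enclosing structure type, which is the usage the header comment hints at.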


/*
 * util.h -- Miscellaneous defines
 *
 * Copyright (C) 1997 Pretty Good Privacy, Inc.
 *
 * Written by Mark H. Weaver
 *
 * $Id: util.h,v 1.23 1997/11/12 23:28:56 mhw Exp $
 */

#ifndef UTIL_H
#define UTIL_H 1

typedef unsigned long   word32;
typedef unsigned short  word16;
typedef unsigned char   byte;

#define FMT32   "%08lx"
#define FMT16   "%04x"
#define FMT8    "%02x"

#define TAB_CHAR            '\244'  /* Currency symbol, like o in top of x */
#define TAB_STRING          "\244"
#define TAB_PAD_CHAR        ' '     /* The fact that this is space has leaked. */
#define TAB_PAD_STRING      " "     /* It may not be freely changed. */
#define FORMFEED_CHAR       '\245'  /* Yen symbol, like = on top of Y */
#define FORMFEED_STRING     "\245"
#define SPACE_CHAR          '\267'  /* Middle dot, or bullet */
#define SPACE_STRING        "\267"
#define CONTIN_CHAR         '\266'  /* Pilcrow (paragraph symbol) */
#define CONTIN_STRING       "\266"

#define BYTES_PER_LINE      60      /* When using radix 64 */

#define LINES_PER_PAGE      65      /* Exclusive of 2 header lines */
#define LINE_LENGTH         95
#define PREFIX_LENGTH       7       /* Length of prefix, including the space */

#define HDR_PREFIX_CHAR     '-'
#define RADIX64_END_CHAR    '-'

typedef struct EncodeFormat     EncodeFormat;
typedef word32                  CRC;
typedef word16                  CRCFragment;

typedef struct
{
    CRC         table[256];
    int         bits;
    CRC         poly;
    CRC         highBit;
} CRCPoly;

struct EncodeFormat
{
    EncodeFormat const *nextFormat;
    char                headerTypeChar;
    char const *        digits;
    signed char const * digitsInv;
    int                 bitsPerDigit;
    int                 radix;
    CRCPoly const *     lineCRC;
    CRCPoly const *     pageCRC;
    int                 runningCRCBits;
    int                 runningCRCShift;
    int                 runningCRCMask;
};


#define HDR_ENC_LENGTH      19      /* Length of encoded prefix on header */

#define HDR_VERSION_BITS    4
#define HDR_FLAG_BITS       8
/* Page CRC bits omitted, since it's not constant */
#define HDR_TABWIDTH_BITS   4
#define HDR_PRODNUM_BITS    12
#define HDR_FILENUM_BITS    16


/* Enough to hold one whole page of munged data */
/* There is no point making this excessively too large */
#define PAGE_BUFFER_SIZE    8192

#if PAGE_BUFFER_SIZE < (LINES_PER_PAGE + 2) * (LINE_LENGTH + PREFIX_LENGTH + 2)
#error PAGE_BUFFER_SIZE is too small
#endif


/* Header flags */
#define HDR_FLAG_LASTPAGE   0x01    /* Indicates last page of file */


#define elemsof(array) (sizeof(array)/sizeof(*(array)))


extern char const       hexDigits[];
extern char const       radix64Digits[];

extern signed char      hexDigitsInv[256];
extern signed char      radix64DigitsInv[256];

extern CRCPoly          crcCCITTPoly, crc24Poly, crc32Poly;

extern EncodeFormat const       hexFormat, radix64Format;
extern EncodeFormat const *     firstFormat;


#define HexDigitValue(ch)       hexDigitsInv[(byte)(ch)]
#define Radix64DigitValue(ch)   radix64DigitsInv[(byte)(ch)]

/* Returns the number of chars needed to encode the given number of bits */
#define EncodedLength(fmt, numBits) \
        (((numBits) + (fmt)->bitsPerDigit - 1) / (fmt)->bitsPerDigit)
#define EncodeDigit(fmt, value)     ((fmt)->digits[value])
#define DecodeDigit(fmt, digit)     ((fmt)->digitsInv[(byte)digit])

#define AdvanceCRC(poly, crc, b) \
        ((crc) >> 8) ^ (poly)->table[((crc) ^ (b)) & 0xFF]

#define RunningCRCFromPageCRC(fmt, pageCRC) \
        (((pageCRC) >> (fmt)->runningCRCShift) & (fmt)->runningCRCMask)


CRC CalculateCRC(CRCPoly const *poly, CRC crc,
                 byte const *buffer, size_t length);
CRC ReverseCRC(CRCPoly const *poly, CRC crc, byte b);

/* Returns the number of chars encoded */
int EncodeCheckDigits(EncodeFormat const *fmt, word32 num,
                      int numBits, char *dest);


--29ba 00181caeb7440010001 Page 3 of util.h 



/* Returns 1 if there's an error */
int DecodeCheckDigits(EncodeFormat const *fmt, char const *src, char **endPtr,
                      int numBits, word32 *valuePtr);

EncodeFormat const *FindFormat(char headerTypeChar);

void InitUtil();


#endif /* !UTIL_H */

/*
 * Local Variables:
 * tab-width: 4
 * End:
 * vi: ts=4 sw=4
 * vim: si
 */
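
The AdvanceCRC macro has the shape of a byte-at-a-time, table-driven CRC that shifts right. The sketch below shows how such a table can be built and used. The reflected polynomial 0xEDB88320 and the all-ones initial value and final inversion are the conventional CRC-32 parameters and are assumptions here; util.h does not state which polynomial or conditioning `crc32Poly` uses. All `sketch_*` names are hypothetical.

```c
#include <stddef.h>

typedef unsigned long sketch_word32;    /* At least 32 bits wide */

typedef struct SketchCRCPoly {
    sketch_word32 table[256];
} SketchCRCPoly;

/* Fill the table for a right-shifting CRC with the given reflected polynomial. */
void sketch_crc_init(SketchCRCPoly *poly, sketch_word32 reflectedPoly)
{
    unsigned i, bit;

    for (i = 0; i < 256; i++) {
        sketch_word32 c = i;

        for (bit = 0; bit < 8; bit++)
            c = (c & 1) ? (c >> 1) ^ reflectedPoly : c >> 1;
        poly->table[i] = c;
    }
}

/* Same shape as AdvanceCRC, with the whole expression parenthesised. */
#define SKETCH_ADVANCE_CRC(poly, crc, b) \
        (((crc) >> 8) ^ (poly)->table[((crc) ^ (b)) & 0xFF])

/* Run the CRC over a buffer, starting from the given value. */
sketch_word32 sketch_crc(const SketchCRCPoly *poly, sketch_word32 crc,
                         const unsigned char *buf, size_t len)
{
    while (len--)
        crc = SKETCH_ADVANCE_CRC(poly, crc, *buf++);
    return crc;
}
```

With the assumed CRC-32 parameters, calling `sketch_crc` with an initial value of 0xFFFFFFFF on the nine bytes "123456789" and inverting the result gives the standard check value 0xCBF43926.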


/* $Id: mempool.h,v 1.2 1997/11/13 23:53:09 colin Exp $ */

#ifndef MEMPOOL_H
#define MEMPOOL_H

typedef struct MemPool {
    struct PoolBuf *head;
    char *freeptr;
    unsigned freespace;
    unsigned chunksize;     /* Default starting point */
    unsigned long totalsize;
    int (*purge)(void *);   /* Return non-zero to retry alloc */
    void *purgearg;
} MemPool;

/* A global pool for miscellaneous stuff. */
extern struct MemPool MiscPool;

/*
 * Nice clean interfaces
 */
void memPoolInit(struct MemPool *pool);
void memPoolSetPurge(struct MemPool *pool, int (*purge)(void *), void *arg);
void memPoolEmpty(struct MemPool *pool);
void memPoolCutBack(struct MemPool *dest, struct MemPool const *cutback);
void *memPoolAlloc(struct MemPool *pool, unsigned len, unsigned alignment);
#ifdef DEADCODE
char const *memPoolStore(struct MemPool *pool, char const *str);
#endif

/* Lookie here!  An ANSI-compliant alignment finder! */
#define alignof(type) (sizeof(struct{type _x; char _y;}) - sizeof(type))

#define memPoolNew(pool, type) memPoolAlloc(pool, sizeof(type), alignof(type))

#endif /* MEMPOOL_H */
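
The alignof macro works because a struct holding a `type` followed by a single `char` is padded up to the next multiple of the type's alignment, so its size exceeds `sizeof(type)` by exactly that alignment on typical compilers. The sketch below restates the trick under a different name (to avoid clashing with the C11 `alignof` macro) and adds a hypothetical round-up helper of the kind a pool allocator would need. Neither name comes from mempool.h.

```c
#include <stddef.h>

/* Same trick as the alignof macro above, renamed for this sketch. */
#define SKETCH_ALIGNOF(type) \
        (sizeof(struct { type _x; char _y; }) - sizeof(type))

/* Example structure whose second member needs padding before it. */
struct SketchPair {
    char tag;
    long value;
};

/* Hypothetical helper: round offset up to the next multiple of alignment. */
size_t sketch_align_up(size_t offset, size_t alignment)
{
    size_t rem = offset % alignment;

    return rem ? offset + (alignment - rem) : offset;
}
```

For example, `sketch_align_up(1, SKETCH_ALIGNOF(long))` gives the offset at which the compiler places `value` in `struct SketchPair`, which is the adjustment an allocator like memPoolAlloc has to make to its free pointer before handing out an object.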


/*
 * repair.c -- Program which reconstructs scanned source, locates errors,
 *             and tries to fix most of them automatically.  If it
 *             can't, it drops you into an editor on the appropriate
 *             line for manual correction.
 *
 * Given a file "foo", this appends corrected output to "foo.out"
 * and copies remaining uncorrected input in "foo.in".  If "foo.in"
 * exists initially, "foo" is ignored and only "foo.in" is processed.
 * Thus, re-running it repeatedly, possibly with other correction
 * techniques in between, will result in correct output in "foo.out"
 * and an empty "foo.in" file.
 *
 * This can automatically invoke an editor for you on the .in file
 * and re-run itself.  The editor is chosen in the first available way:
 * - The -e command-line argument takes a printf() format string to
 *   format the editor invocation command line with the line number and
 *   filename.  E.g. "emacs +%u %s".  %u and %s must appear, in that order.
 * - Failing that, the default is "$VISUAL +%u %s"
 * - Failing that, the default is "$EDITOR +%u %s"
 * - Failing that, the program prints the error location and exits.
 *   Specifying -e- forces this behaviour.
 *
 * Copyright (C) 1997 Pretty Good Privacy, Inc.
 *
 * Designed by Colin Plumb, Mark H. Weaver, and Philip R. Zimmermann
 * Written by Colin Plumb
 *
 * $Id: repair.c,v 1.37 1997/11/14 08:39:40 mhw Exp $
 */
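
The editor selection order described in the comment above can be sketched as a small pure function. This is an illustration only: `sketch_editor_format` is a hypothetical name, the environment values are passed in as parameters instead of being read with `getenv`, an empty variable is treated as unset, and "-e-" is read as the option value "-". All of these are assumptions and not taken from repair.c. The sketch also does not verify that a user-supplied format contains %u and %s.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: choose the printf() format used to build the editor
 * command line.  Returns buf on success, or NULL when no editor should be
 * run (the caller would then just print the error location).
 */
const char *sketch_editor_format(const char *eArg, const char *visual,
                                 const char *editor, char *buf, size_t bufSize)
{
    const char *prog;
    int n;

    if (eArg != NULL) {
        if (strcmp(eArg, "-") == 0)     /* "-e-": assumed to mean no editor */
            return NULL;
        n = snprintf(buf, bufSize, "%s", eArg);
        return (n >= 0 && (size_t)n < bufSize) ? buf : NULL;
    }
    /* Fall back to $VISUAL, then $EDITOR; empty counts as unset here */
    prog = (visual && *visual) ? visual
         : (editor && *editor) ? editor : NULL;
    if (prog == NULL)
        return NULL;
    n = snprintf(buf, bufSize, "%s +%%u %%s", prog);
    return (n >= 0 && (size_t)n < bufSize) ? buf : NULL;
}
```

A caller would typically pass `getenv("VISUAL")` and `getenv("EDITOR")` for the second and third arguments, then format the result with the line number and file name.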

#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <errno.h>
#include <signal.h>

#include "util.h"
#include "heap.h"
#include "mempool.h"
#include "subst.h"

/*
 * The internal form of a substitution.  These are stored on
 * lists indexed by the first character of the input substitution.
 */
typedef struct Substitution {
    struct Substitution *next;
    char const *input, *output;
    size_t inlen, outlen;
    HeapCost cost, cost2;
    FilterFunc *filter;
    unsigned int index;     /* Consecutive serial numbers */
} Substitution;

struct Substitution const substNull = { NULL, "", "", 0, 0, 0, 0, 0 };

/*
 * This might get increased later to support multiple classes of
 * substitutions, for different contexts.  Currently, only one
 * is used.
 */
#define SUBST_CLASSES 1



--8b3e 


d85dbc 
d88f9c 
44af5a 
7938e5 
5278c5 
301dcf 
35495d 
3e542a 
с23053 
ffd090 
cdd161 
2cb806 
а1Ғ319 
d732f4 
с82657 
434410 
e4af5a 
14c88d 
50f27e 
325596 
68af5a 
7841a9 
982978 
455fdb 
5abb36 
albb19 
79af5a 
8b9972 
9660b2 
19bbcc 
35c94d 
02911c 
62ce23 
6eaf5a 
160е37 
b3cd60 
c2cf6b 
578350 
8eefe6 
75af5a 
a638e5 
с99701 
089377 


2c0240 · 
270336 · 


e5495d 
14ed99 
c5bd82 
20bb36 
b41664 
fcaf5a 
c4f9aa 
8941ес 
1а4еа1 
еа087е 
5са4р2 
cd8350 
82деда 
abefe6 
96af5a 
{22978 
f5fbc2 
/* List of substitutions, indexed by first character, plus a catch-all */
Substitution *substitutions[SUBST_CLASSES][0x101];


/*
 * The pool of Substitution structures.  Remains alive for the entire
 * execution of the program.
 */
static MemPool substPool;
static Substitution *substFree;
static unsigned int substCount = 1;	/* Preallocate 0 to substNull */
static unsigned int substFirstDynamic;

#define SubstIsDynamic(s) ((s)->index >= substFirstDynamic)

/* Adjust the substitution based on n occurrences this page */
#define SubstAdjust(s,n) ((s)->cost = (s)->cost2)

/* Is this a nasty-line substitution? */
#define SubstIsNasty(s) ((s)->cost2 == COST_INFINITY)

/* Every possible single-character string */
static char substChars[512];
#define SubstString(c) (substChars+2*((c)&255))


/* Set the list of substitutions to empty */
static void
SubstInit(void)
{
	unsigned int i, j;

	memPoolInit(&substPool);
	substFree = 0;
	substCount = 1;	/* Number zero is reserved for uncounted substitutions */
	for (i = 0; i < elemsof(substitutions); i++)
		for (j = 0; j < elemsof(*substitutions); j++)
			substitutions[i][j] = NULL;

	for (i = 0; i < 256; i++) {
		substChars[2*i] = (char)i;
		substChars[2*i+1] = 0;
	}
}
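
/*
 * Illustration (not part of repair.c): a minimal sketch of the
 * single-character string table that SubstInit() fills in.  Each byte
 * value c owns two bytes of the table, { c, '\0' }, so a pointer into the
 * table is a ready-made one-character C string that never needs allocating.
 * The names demoChars, demoCharsInit and demoString are hypothetical
 * stand-ins for substChars, SubstInit and SubstString.
 */

```c
#include <assert.h>

/* Hypothetical stand-in for substChars in the listing. */
static char demoChars[512];

/* Fill the table so that entry c holds the two bytes { c, '\0' }. */
static void
demoCharsInit(void)
{
	unsigned int i;

	for (i = 0; i < 256; i++) {
		demoChars[2*i] = (char)i;
		demoChars[2*i+1] = 0;
	}
}

/* Return a NUL-terminated one-character string for byte value c. */
static char const *
demoString(int c)
{
	assert(c >= -128 && c <= 255);
	return demoChars + 2*(c & 255);	/* & 255 folds signed chars into 0..255 */
}
```

/*
 * After demoCharsInit(), demoString('a') points at the string "a", and
 * demoString(0) points at the empty string, which is how a deleted
 * character can be represented as an output.
 */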

/*
 * For dynamically allocated substitutions, we maintain a free list.
 * Each substitution has a unique serial number.  These are retained
 * if a substitution goes on the free list, to keep substCount from
 * ratcheting upwards indefinitely while still guaranteeing uniqueness.
 */
static Substitution *
SubstAlloc(void)
{
	struct Substitution *subst = substFree;

	if (subst) {
		substFree = subst->next;
	} else {
		subst = memPoolNew(&substPool, Substitution);
		subst->index = substCount++;
	}
	return subst;
}


static void
SubstFree(Substitution *subst)
{
	subst->next = substFree;
	substFree = subst;
}

static Substitution *
MakeSubst(char const *input, char const *output, HeapCost cost, HeapCost cost2,
	FilterFunc *filter, int class)
{
	struct Substitution *subst, **head;

	subst = SubstAlloc();
	subst->input = input;
	subst->output = output;
	subst->inlen = strlen(input);
	subst->outlen = strlen(output);
	subst->cost = cost;
	subst->cost2 = cost2;
	subst->filter = filter;

	/*
	 * Ignore certain substitutions when printing stats.
	 * Identity substitutions, and the tab/space tweaking.
	 */
	if (strcmp(input, output) == 0 || strcmp(input, TAB_STRING) == 0 ||
	    (input[0] == ' ' && input[1] == 0 && output[0] == 0)) {
		if (subst->index == substCount-1)
			substCount--;
		subst->index = 0;	/* Evil hack */
	}

	/* Lists are indexed by the first character of the input */
	head = &substitutions[class][input[0] & 255];
	subst->next = *head;
	*head = subst;
	return subst;
}

/*
 * For each entry in the raw array, turn { "abc", "def", 5 }
 * into cost-5 mappings of "a"->"d", "b"->"e" and "c"->"f".
 * If the output string is NULL, the characters are deleted.
 * An input string of NULL is the end of table delimiter.
 */
static void
SubstSingle(struct RawSubst const *raw, int class)
{
	char const *input, *output;
	int i, o;

	while (raw->input) {
		input = raw->input;
		output = raw->output;
		assert(!output || strlen(input) == strlen(output));

		while (*input) {
			i = *input++;
			o = output ? *output++ : 0;
			(void)MakeSubst(SubstString(i), SubstString(o),
			                raw->cost, raw->cost2, raw->filter, class);
		}
		raw++;
	}
}

/*
 * For each entry in the raw array, turn { "abc", "def", 5 }
 * into a cost-5 mapping of "abc"->"def".
 * An input string of NULL is the end of table delimiter.
 */
static void
SubstMultiple(struct RawSubst const *raw, int class)
{
	while (raw->input) {
		(void)MakeSubst(raw->input, raw->output, raw->cost, raw->cost2,
		                raw->filter, class);
		raw++;
	}
}

/* Build the substitutions table */
static void
SubstBuild(void)
{
	SubstInit();
	SubstSingle(substSingles, 0);
	SubstMultiple(substMultiples, 0);
	substFirstDynamic = substCount;
}


/*
 * See if the desired substitution already exists
 */
static Substitution const *
SubstSearch(char const *in, size_t inlen, char const *out, size_t outlen,
	int class)
{
	Substitution *subst = substitutions[class][in[0] & 255];

	for (; subst; subst = subst->next) {
		if (subst->inlen == inlen && subst->outlen == outlen &&
		    memcmp(subst->input, in, inlen) == 0 &&
		    memcmp(subst->output, out, outlen) == 0)
			return subst;	/* Already exists */
	}
	return NULL;
}

/*
 * Create a new dynamic substitution.  First search to make
 * sure it doesn't already exist.
 */
static Substitution const *
SubstDynamic(char const *in, char const *out, int class)
{
	Substitution const *subst;

	subst = SubstSearch(in, strlen(in), out, strlen(out), class);
	return subst ? subst : MakeSubst(in, out, COST_INFINITY,
	                                 DYNAMIC_COST_LEARNED, NULL, class);
}

/*
 * Search for the substitution, allocating one if not found.
 * The input string is not null-terminated and needs to be copied to
 * an allocated buffer.  The output string can just be pointer-copied.
 */
static Substitution const *
SubstNasty(char const *in, size_t inlen, char const *out, int class)
{
	Substitution const *subst;
	char *string;

	if ((subst = SubstSearch(in, inlen, out, strlen(out), class)) != NULL)
		return subst;

	/* One extra byte for the terminating null */
	if (!(string = malloc(inlen+1))) {
		fputs("Out of memory!\n", stderr);
		exit(1);
	}
	memcpy(string, in, inlen);
	string[inlen] = 0;
	return MakeSubst(string, out, COST_INFINITY, COST_INFINITY, NULL, class);
}


/*
 * The state of the parser.
 * Note that this is updated when a ParseNode is *removed* from the heap;
 * ParseNodes that are in the heap have ParseStates that reflect the
 * state before the substitution has been parsed; this is a copy of the
 * parent's state, which is after the parsing.
 */
typedef struct ParseState {
	CRC page_crc;		/* Computed per-page CRC */
	word16 flags;		/* Flags; see below */
	unsigned char pos;	/* Position on the line */
} ParseState;	/* 7 bytes, rounded to 8 */

/* Flags values */
#define PS_MASK_PAGENUM		0xC000	/* Digits in header page number (1..3) */
#define PS_SHIFT_PAGENUM	14	/* Shift for the above */
#define PS_FLAG_EOL		512	/* Expect \n next */
#define PS_FLAG_SPACE		256	/* Was last char a space? */
#define PS_FLAG_TAB		128	/* Tabbing over a column */
#define PS_FLAG_INHEADER	64	/* Current line is a header */
#define PS_FLAG_PASTHEADER	32	/* A previous line was a header */
#define PS_FLAG_BINWS		16	/* In whitespace after binary data */
#define PS_FLAG_BINEND		8	/* End of binary data */
#define PS_FLAG_DYNAMIC		4	/* Have used ECC this line */
#define PS_MASK_FORMAT		3	/* The encoding format (max of 3, for now) */
#define PS_SHIFT_FORMAT		0	/* Shift for the above */


/* Have we started on a second page?  Used to force flushing of the first. */
#define InSecondHeader(ps) \ 
A((~(ps)->flags & (PS_FLAG_INHEADER | PS_FLAG_PASTHEADER)) == 0) 


#define PageNumDigits(pn) (((pn)->ps.flags & PS_MASK_PAGENUM) >> PS_SHIFT_PAGENUM) 
#define PageNumDigitsIncrement(pn) ((pn)->ps.flags += 1<<PS_SHIFT_PAGENUM) 
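
/*
 * Illustration (not part of repair.c): a minimal sketch of how the flags
 * word packs a two-bit digit counter into its top bits next to ordinary
 * one-bit flags.  The demo* and DEMO_* names are hypothetical; the bit
 * values mirror the PS_* definitions, and demoWord16 is assumed to be an
 * unsigned type of at least 16 bits.
 */

```c
#include <assert.h>

typedef unsigned short demoWord16;	/* assumed to be at least 16 bits */

#define DEMO_MASK_PAGENUM	0xC000u	/* bits 15..14: digit count */
#define DEMO_SHIFT_PAGENUM	14
#define DEMO_FLAG_INHEADER	64u
#define DEMO_FLAG_PASTHEADER	32u

/* Number of page-number digits recorded in the flags word (0..3). */
static unsigned int
demoPageNumDigits(demoWord16 flags)
{
	return (flags & DEMO_MASK_PAGENUM) >> DEMO_SHIFT_PAGENUM;
}

/* Record one more digit; the caller must not go past 3. */
static demoWord16
demoPageNumDigitsIncrement(demoWord16 flags)
{
	assert(demoPageNumDigits(flags) < 3);
	return (demoWord16)(flags + (1u << DEMO_SHIFT_PAGENUM));
}

/* True only when both header flags are set at once. */
static int
demoInSecondHeader(demoWord16 flags)
{
	return (~flags & (DEMO_FLAG_INHEADER | DEMO_FLAG_PASTHEADER)) == 0;
}
```

/*
 * The "~flags & mask == 0" form tests that every bit of the mask is set,
 * which is why a line that is a header while an earlier header has
 * already been seen is recognised as the start of a second page.
 */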


EncodeFormat const *registeredFormats[4];

/* Returns a small integer index */
static int
registerFormat(EncodeFormat const *format)
{
	int i;

	for (i = 0; i < (int)elemsof(registeredFormats); i++) {
		if (registeredFormats[i] == format)
			return i;
		if (!registeredFormats[i]) {
			registeredFormats[i] = format;
			return i;
		}
	}
	fputs("Registered formats table overflow!\n", stderr);
	exit(1);
}

#define psFormat(ps) registeredFormats[((ps)->flags & PS_MASK_FORMAT)>>PS_SHIFT_FORMAT]
#define pnFormat(pn) psFormat(&(pn)->ps)
#define psSetFormat(ps, i) \
	((ps)->flags = ((ps)->flags & ~PS_MASK_FORMAT) | i << PS_SHIFT_FORMAT)

typedef struct ParseNode {
	HeapCost cost;
	unsigned int refcnt;
	struct ParseNode *parent;
	char const *input;
	struct Substitution const *subst;
	struct ParseState ps;
} ParseNode;	/* 32 bytes */

/* A handle for walking backwards through the output stream */
typedef struct OutputHandle {
	ParseNode const *node;
	char const *output;
	unsigned int pos;
} OutputHandle;

/* Initialize the handle to point to a node (optionally, a position therein) */
static void
OutputInit(OutputHandle *oh, ParseNode const *node, char const *p)
{
	oh->node = node;
	oh->output = p ? p : node->subst->output + node->subst->outlen;
	oh->pos = 0;
}


/* Get the *previous* byte */
static int
OutputGetPrev(OutputHandle *oh)
{
	if (!oh->node)
		return -1;
	for (;;) {
		if (oh->output != oh->node->subst->output) {
			oh->pos++;
			return *--oh->output & 255;
		}
		oh->node = oh->node->parent;
		if (!oh->node)
			break;
		oh->output = oh->node->subst->output + oh->node->subst->outlen;
	}
	return -1;
}


/* Return the character just before the node - trivial handy wrapper */
static int
OutputPrevChar(ParseNode const *node)
{
	OutputHandle oh;

	OutputInit(&oh, node, NULL);
	return OutputGetPrev(&oh);
}
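
/*
 * Illustration (not part of repair.c): a minimal sketch of reading the
 * output stream backwards by following parent links, as OutputGetPrev()
 * does.  DemoNode and DemoHandle are hypothetical, stripped-down versions
 * of ParseNode and OutputHandle: each node carries only its own output
 * text, and the position counter is left out.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, stripped-down node: just its output text and its parent. */
typedef struct DemoNode {
	struct DemoNode const *parent;
	char const *output;
} DemoNode;

typedef struct DemoHandle {
	DemoNode const *node;
	char const *output;	/* One past the next byte to return */
} DemoHandle;

/* Start at the end of the given node's output. */
static void
demoInit(DemoHandle *h, DemoNode const *node)
{
	assert(node != NULL);
	h->node = node;
	h->output = node->output + strlen(node->output);
}

/* Return the previous output byte, or -1 once the root is exhausted. */
static int
demoGetPrev(DemoHandle *h)
{
	while (h->node) {
		if (h->output != h->node->output)
			return *--h->output & 255;
		/* This node is used up (or empty); move to its parent */
		h->node = h->node->parent;
		if (h->node)
			h->output = h->node->output + strlen(h->node->output);
	}
	return -1;
}
```

/*
 * With a chain root("ab") <- mid("") <- leaf("c"), successive calls
 * starting at the leaf yield 'c', 'b', 'a' and then -1; nodes with empty
 * output (deletions) are skipped without special handling.
 */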


/*
 * Unget the last retrieved character (and return it), or
 * -1 if that is impossible.  At least one character is
 * always ungettable, but after that you're on your own.
 */
static int
OutputUnget(OutputHandle *oh)
{
	if (oh->node && *oh->output) {
		oh->pos--;
		return *oh->output++ & 255;
	}
	return -1;
}


/* The position is useful for comparing two OutputHandles. */
#define OutputPos(oh) ((oh)->pos)

/*
 * Fill backwards from bufend until you hit the given char.
 * Use -1 to get the whole buffer.
 */
static char *
OutputGetUntil(OutputHandle oh, char *bufend, int end)
{
	int c;

	while ((c = OutputGetPrev(&oh)) != -1 && c != end)
		*--bufend = (char)c;
	return bufend;
}


/*
 * The per-page structure.  This is actually global, but describes
 * the values kept for each page processed.
 */
typedef struct PerPage {
	CRC page_check;
	char const *maxpos, *minpos;
	unsigned int tabsize;	/* Zero means this is a binary page */
	unsigned int lines;
	unsigned int retries;	/* How many retries since last progress? */
	unsigned int max_retries;	/* Maximum number of retries needed. */
} PerPage;

PerPage perpage;	/* The global */

static void
PerPageInit(char const *buf)
{
	perpage.maxpos = perpage.minpos = buf;
	perpage.page_check = 0;
	perpage.tabsize = 4;	/* The default */
	perpage.lines = perpage.retries = perpage.max_retries = 0;
}


/*
 * Is the tab substitution being looked at acceptable?
 * It is if its length is exactly what is needed to make the tab
 * width come out right.  Otherwise, it's junk.
 */
HeapCost
TabFilter(struct ParseNode *parent, char const *limit,
	struct Substitution const *subst)
{
	int c, tabpos;
	OutputHandle oh;

	(void)limit;
	if (!perpage.tabsize)
		return COST_INFINITY;	/* No interest */

	/* How wide should the tab be? */
	tabpos = (int)((parent->ps.pos-PREFIX_LENGTH) % perpage.tabsize);
	if ((int)subst->outlen != (int)perpage.tabsize - tabpos)
		return COST_INFINITY;
	/* The right number - cost if likely, cost2 if unlikely */
	if (subst->cost == subst->cost2)
		return subst->cost;

	OutputInit(&oh, parent, NULL);
	do {
		c = OutputGetPrev(&oh);
	} while (c == ' ');
	return (c == TAB_CHAR) ? subst->cost : subst->cost2;
}
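
/*
 * Illustration (not part of repair.c): a minimal sketch of the tab-width
 * arithmetic in TabFilter().  demoTabWidth is a hypothetical helper; the
 * prefix length is passed in as a parameter because the definition of
 * PREFIX_LENGTH is not shown in this part of the listing.
 */

```c
#include <assert.h>

/*
 * Width of a tab that starts at line position `pos` when the text that
 * follows a `prefix`-column checksum prefix has tab stops every
 * `tabsize` columns.
 */
static int
demoTabWidth(int pos, int prefix, int tabsize)
{
	assert(tabsize > 0 && pos >= prefix);
	return tabsize - (pos - prefix) % tabsize;
}
```

/*
 * Assuming a 7-column prefix and 4-column tabs, a tab starting right
 * after the prefix is 4 wide, and one starting a column later is 3 wide.
 * A candidate substitution whose output length differs from this width
 * is rejected with an infinite cost.
 */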

/*
 * Return cost if near blanks (including end-of-line), cost2 if not, and
 * the average if there is a blank on one side.  There are additional
 * versions for upper- and lower-case.  _ is considered upper-case,
 * as it's often used in macro identifiers.
 */
HeapCost
FilterNearBlanks(struct ParseNode *parent, char const *limit,
	struct Substitution const *subst)
{
	int c = OutputPrevChar(parent), score = (isspace(c) != 0);
	char const *p = parent->input + parent->subst->inlen;

	score += p == limit || isspace((unsigned char)*p) != 0;
	return (subst->cost*score + subst->cost2*(2-score))/2;
}

HeapCost
FilterNearUpper(struct ParseNode *parent, char const *limit,
	struct Substitution const *subst)
{
	int c = OutputPrevChar(parent), score = (isupper(c) != 0 || c == '_');
	char const *p = parent->input + subst->inlen;

	score += p != limit && (isupper((unsigned char)*p) != 0 || *p == '_');
	return (subst->cost*score + subst->cost2*(2-score))/2;
}

HeapCost
FilterNearXDigit(struct ParseNode *parent, char const *limit,
	struct Substitution const *subst)
{
	int c = OutputPrevChar(parent), score = (isxdigit(c) != 0);
	char const *p = parent->input + subst->inlen;

	score += p != limit && (isxdigit((unsigned char)*p) != 0);
	return (subst->cost*score + subst->cost2*(2-score))/2;
}

HeapCost
FilterNearLower(struct ParseNode *parent, char const *limit,
	struct Substitution const *subst)
{
	int c = OutputPrevChar(parent), score = (islower(c) != 0);
	char const *p = parent->input + subst->inlen;

	score += p != limit && (islower((unsigned char)*p) != 0);
	return (subst->cost*score + subst->cost2*(2-score))/2;
}
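
/*
 * Illustration (not part of repair.c): a minimal sketch of the cost
 * blending shared by the FilterNear*() functions.  The score counts how
 * many of the two neighbouring characters match (0, 1 or 2), and the
 * result slides from cost2 to cost accordingly.  DemoCost is a
 * hypothetical stand-in for HeapCost, whose real type is defined in
 * heap.h and not shown here.
 */

```c
#include <assert.h>

typedef unsigned int DemoCost;	/* stand-in for HeapCost */

/* score 2 -> cost, score 0 -> cost2, score 1 -> their (truncated) mean */
static DemoCost
demoBlendCost(DemoCost cost, DemoCost cost2, int score)
{
	assert(score >= 0 && score <= 2);
	return (cost*score + cost2*(2-score))/2;
}
```

/*
 * For example, with cost 10 and cost2 30 the three possible scores give
 * 30, 20 and 10.  The division truncates, so odd sums round down.
 */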

/*
 * cost2 unless previous character was a space (' ' or SPACE_CHAR).
 * Note the & 255, necessary since chars might be signed and SPACE_CHAR
 * is in the high (negative) half, but c is an int in the range -1..255.
 */
HeapCost
FilterFollowsSpace(struct ParseNode *parent, char const *limit,
	struct Substitution const *subst)
{
	int c = OutputPrevChar(parent);

	(void)limit;
	return (c == ' ' || c == (SPACE_CHAR & 255)) ? subst->cost : subst->cost2;
}

/* cost2 unless previous character was duplicate of this one */
HeapCost
FilterAfterRepeat(struct ParseNode *parent, char const *limit,
	struct Substitution const *subst)
{
	int c = OutputPrevChar(parent);

	(void)limit;
	return (c == subst->output[0]) ? subst->cost : subst->cost2;
}

/* cost2 unless probably the closing quote in a char constant */
HeapCost
FilterCharConst(struct ParseNode *parent, char const *limit,
	struct Substitution const *subst)
{
	OutputHandle oh;
	int c;

	(void)limit;
	OutputInit(&oh, parent, NULL);
	c = OutputGetPrev(&oh);
	c = OutputGetPrev(&oh);
	if (c == '\\')
		c = OutputGetPrev(&oh);
	return (c == '\'') ? subst->cost : subst->cost2;
}

/*
 * If the identifier leading up to the current position contains
 * an underscore, then it's likely the current position is an underscore
 * as well; return cost.  If it does not, it's less likely; return cost2.
 */
HeapCost
FilterLikelyUnderscore(struct ParseNode *parent, char const *limit,
	struct Substitution const *subst)
{
	OutputHandle oh;
	int c;

	(void)limit;
	OutputInit(&oh, parent, NULL);
	for (;;) {
		c = OutputGetPrev(&oh);
		if (c == '_')
			return subst->cost;
		if (!isalnum(c))
			return subst->cost2;
	}
}

/* cost2 unless the following chars seem to be a checksum */
HeapCost
FilterChecksumFollows(struct ParseNode *parent, char const *limit,
	struct Substitution const *subst)
{
	int i, score = 0;
	char const *p = parent->input + subst->inlen;

	if (limit - p < PREFIX_LENGTH)
		return subst->cost2;
	if (!isspace((unsigned char)p[PREFIX_LENGTH-1]))
		return subst->cost2;
	for (i = 0; i < PREFIX_LENGTH-1; i++)
		score += (p[i] >= '0' && p[i] <= '9') + (p[i] >= 'a' && p[i] <= 'f');
	i = (score >= PREFIX_LENGTH-2 ? subst->cost : subst->cost2);
	/* Magic, since this function is perfect on binary files */
	if (i < COST_INFINITY && perpage.tabsize == 0)
		i = 0;
	return i;
}

/* Manage a *big* pool of ParseNodes */
struct MemPool nodePool;
struct ParseNode *nodeFreeList = 0;

/* Prepare for node allocations */
static void
NodePoolInit(void)
{
	memPoolInit(&nodePool);
	nodeFreeList = NULL;
}

/* Free all nodes in one swell foop */
static void
NodePoolCleanup(void)
{
	nodeFreeList = NULL;
	memPoolEmpty(&nodePool);
}

/* Allocate a new (uninitialized) node */
static struct ParseNode *
NodeAlloc(void)
{
	struct ParseNode *node;

	node = nodeFreeList;
	if (node) {
		nodeFreeList = node->parent;
		return node;
	}
	return memPoolNew(&nodePool, ParseNode);
}

/* Free a node for reallocation */
static void
NodeFree(struct ParseNode *node)
{
	node->parent = nodeFreeList;
	nodeFreeList = node;
}


/*
 * Decrement a node's reference count, freeing it and
 * recursively decrementing its parent's if the count
 * goes to zero.
 */
static void
NodeRelease(struct ParseNode *node)
{
	struct ParseNode *parent;

	assert(node->refcnt);

	while (!--node->refcnt) {
		parent = node->parent;
		NodeFree(node);
		if (!parent)
			break;
		node = parent;
	}
}
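
/*
 * Illustration (not part of repair.c): a minimal sketch of the
 * reference-count cascade in NodeRelease().  DemoRef is a hypothetical
 * node that records being freed in a flag instead of going back onto a
 * free list, so the effect of each release can be observed.
 */

```c
#include <assert.h>
#include <stddef.h>

typedef struct DemoRef {
	struct DemoRef *parent;
	unsigned int refcnt;
	int freed;	/* Set instead of returning the node to a pool */
} DemoRef;

/* Drop one reference; free upwards while counts keep reaching zero. */
static void
demoRelease(DemoRef *node)
{
	assert(node->refcnt);
	while (!--node->refcnt) {
		DemoRef *parent = node->parent;

		node->freed = 1;
		if (!parent)
			break;
		node = parent;	/* The freed child held one reference */
	}
}
```

/*
 * A parent shared by two children starts with a count of 2.  Releasing
 * the first child frees only that child; releasing the second frees the
 * child and then the parent, without any recursion.
 */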


/* Add nodes to the substitution tree */

/* Create a child of the given node, with the given properties. */
static ParseNode *
AddChild(ParseNode *parent, Substitution const *subst, HeapCost cost)
{
	ParseNode *child;

	if (cost == COST_INFINITY)
		return 0;

	cost += parent->cost;
	child = NodeAlloc();
	*child = *parent;
	/* Child is just like parent, except... */
	child->cost = cost;
	child->refcnt = 1;	/* The heap */
	child->input += subst->inlen;
	child->subst = subst;
	child->parent = parent;
	parent->refcnt++;
	return child;
}


/* Hash table of nasty lines, indexed by per-line CRC */
struct NastyLine {
	struct NastyLine *next;
	char const *line;
	CRC crc;
};

#define NASTY_HASH_SIZE 256
static struct NastyLine *nastyHash[NASTY_HASH_SIZE];	/* All zero */

struct MemPool nastyStrings, nastyStructs;
static CRCPoly const *nastyPoly = &crcCCITTPoly;

/*
 * Create a new NastyString entry if it doesn't already exist.
 * Note that this expects the string passed to end in a newline which
 * IS hashed but NOT stored.
 */
static struct NastyLine *
AddNasty(char const *string)
{
	size_t len = strlen(string);	/* Including newline */
	CRC crc = CalculateCRC(nastyPoly, 0, (byte const *)string, len);
	struct NastyLine *nasty, **nastyp = nastyHash + (crc % NASTY_HASH_SIZE);
	char *line;

	/* Search for an existing copy */
	while ((nasty = *nastyp) != NULL) {
		if (nasty->crc == crc &&
		    memcmp(nasty->line, string, len-1) == 0 &&
		    nasty->line[len-1] == 0)
			return nasty;
		nastyp = &nasty->next;
	}
	/* Create a new structure */
	*nastyp = nasty = memPoolNew(&nastyStructs, struct NastyLine);
	nasty->next = NULL;
	nasty->line = line = memPoolAlloc(&nastyStrings, len, 1);
	nasty->crc = crc;
	memcpy(line, string, len-1);
	line[len-1] = 0;
	return nasty;
}


static void
RehashNasties(CRCPoly const *poly)
{
	struct NastyLine *cur, *head;
	CRC crc;
	int i;
	size_t len;

	/* Put everything into one list and clear the hash table */
	head = NULL;
	for (i = 0; i < (int)elemsof(nastyHash); i++) {
		while ((cur = nastyHash[i]) != NULL) {
			nastyHash[i] = cur->next;
			cur->next = head;
			head = cur;
		}
	}
	/* Recompute CRCs for the list and redistribute them among the buckets */
	while (head) {
		cur = head;
		head = head->next;
		len = strlen(cur->line);
		crc = CalculateCRC(poly, 0, (byte const *)cur->line, len);
		crc = AdvanceCRC(poly, crc, '\n');
		cur->crc = crc;
		cur->next = nastyHash[crc % NASTY_HASH_SIZE];
		nastyHash[crc % NASTY_HASH_SIZE] = cur;
	}
	nastyPoly = poly;
}


/* Read in the nastylines file */
static void
ReadNasties(FILE *f)
{
	char buf[128];

	while (fgets(buf, sizeof(buf)-1, f))
		AddNasty(buf);
}


/*
 * Convert an encoded string to binary.
 * No error checking is performed.
 */
static word32
GetWord32(EncodeFormat const *format, char const *buf, int len)
{
	word32 w = 0;

	while (len--)
		w = (w<<format->bitsPerDigit) + DecodeDigit(format, *buf++);
	return w;
}
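
/*
 * Illustration (not part of repair.c): a minimal sketch of the
 * shift-and-add digit accumulation in GetWord32(), fixed to one concrete
 * case.  demoDecodeHexDigit and demoGetWord are hypothetical; they assume
 * lower-case hexadecimal digits at 4 bits per digit and an ASCII-like
 * character set, whereas the real code takes both the digit width and the
 * decoder from the EncodeFormat.
 */

```c
#include <assert.h>

/* Hypothetical digit decoder: lower-case hex, 4 bits per digit. */
static unsigned long
demoDecodeHexDigit(char c)
{
	assert((c >= '0' && c <= '9') || (c >= 'a' && c <= 'f'));
	return (c <= '9') ? (unsigned long)(c - '0')
	                  : (unsigned long)(c - 'a' + 10);
}

/* Accumulate len digits, most significant first. */
static unsigned long
demoGetWord(char const *buf, int len)
{
	unsigned long w = 0;

	while (len--)
		w = (w << 4) + demoDecodeHexDigit(*buf++);
	return w;
}
```

/*
 * For example, the four digits "bb36" decode to 0xbb36.  Like the
 * original, the loop trusts its input; the assert in the decoder exists
 * only to keep this sketch honest.
 */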


/* Attempt nasty line substitutions */
static void
TryNasty(struct ParseNode *parent, Heap *heap, char const *limit)
{
	struct NastyLine const *nasty;
	struct Substitution const *subst;
	struct ParseNode *child;
	char const *end;
	EncodeFormat const *format = pnFormat(parent);
	OutputHandle oh;
	char buf[4];
	CRC check;
	int i;

	/* Make sure the lines are hashed properly */
	if (nastyPoly != format->lineCRC)
		RehashNasties(format->lineCRC);

	/* Get the line to be replaced */
	assert(parent->ps.pos == PREFIX_LENGTH);
	end = memchr(parent->input, '\n', limit - parent->input);
	if (!end)
		end = limit;
	/* Get the line's check value */
	OutputInit(&oh, parent, NULL);
	(void)OutputGetPrev(&oh);
	i = 4;
	while (i--)	/* Fill all four digits, last one first */
		buf[i] = OutputGetPrev(&oh);
	check = GetWord32(format, buf, 4);
	/* Find the matches */
	nasty = nastyHash[check % NASTY_HASH_SIZE];
	for (; nasty; nasty = nasty->next) {
		if (nasty->crc == check) {
			subst = SubstNasty(parent->input, end-parent->input,
			                   nasty->line, 0);
			if (subst) {
				child = AddChild(parent, subst, NASTY_COST);
				if (child) {
					child->ps.flags |= PS_FLAG_DYNAMIC;
					HeapInsert(heap, &child->cost);
				}
			}
		}
	}
}


/*
 * Form all of a ParseNode's children and add them to the heap.
 * Limit is the limit of allowable lookahead.
 */
static void
AddChildren(ParseNode *parent, Heap *heap, char const *limit)
{
	char c = parent->input[0];
	Substitution *subst = substitutions[0][c & 255];
	ParseNode *child;
	HeapCost cost;

	/* If you want to make pure insertion substitutions, do that here */

	assert(parent->input < limit);	/* We always have at least one char */

	while (subst) {
		if (subst->inlen == 1 ||	/* Easy case */
		    ((size_t)(limit-parent->input) >= subst->inlen &&
		    memcmp(subst->input, parent->input, subst->inlen) == 0))
		{
			cost = subst->cost;
			if (subst->filter)
				cost = subst->filter(parent, limit, subst);
			child = AddChild(parent, subst, cost);
			if (child)
				HeapInsert(heap, &child->cost);
		}
		subst = subst->next;
	}

	/* Whole-line substitutions */
	if (parent->ps.pos == PREFIX_LENGTH)
		TryNasty(parent, heap, limit);
}


/* cost if this line has a dynamic substitution, otherwise cost2 */
HeapCost
FilterIsDynamic(struct ParseNode *parent, char const *limit,
	struct Substitution const *subst)
{
	(void)limit;
	return (parent->ps.flags & PS_FLAG_DYNAMIC) ? subst->cost : subst->cost2;
}

/* cost if the current page is binary mode, else cost2 */
HeapCost
FilterIsBinary(struct ParseNode *parent, char const *limit,
	struct Substitution const *subst)
{
	(void)parent; (void)limit;
	return perpage.tabsize ? subst->cost2 : subst->cost;
}


/* Debugging utility */
#define DEBUG 1	/* Set to 1 to print every line considered */

static size_t lastlen = 0;

static void
OverstrikeLine(char const *line, size_t len)
{
	static size_t lastlen = 0;
	int blanklen;

	if (!line) {
		if (lastlen)
			putchar('\n');
		lastlen = 0;
	} else if (len || lastlen) {
		if (len > 79)
			len = 79;
		blanklen = (lastlen > len) ? (int)lastlen - len : 0;
		printf("%.*s%*s\r", (int)len, line, blanklen, "");
		fflush(stdout);
		lastlen = len;
	}
}


/* Print everything, for debugging */
static void
PrintLine(char const *line, size_t len)
{
	if (line) {
		printf("%.*s\n", (int)len, line);
		lastlen = 0;
	}
}

static HeapCost ParseAdvanceString(Heap *heap, ParseNode *pn);

/*
 * Copy the parsechain from tail up to root, and hang it off of
 * newroot, adjusting the costs and parse state accordingly.  Returns
 * NULL if it is unable to (invalid parse, too expensive, etc.)
 * Note that as per the convention, ParseAdvanceString is *not* called
 * on the new tail node (but is called on all its parents).
 */
static ParseNode *
CopyParse(ParseNode const *tail, ParseNode const *root, ParseNode *newroot)
{
	ParseNode *newtail, *parent;

	if (tail == root)
		return newroot;
	parent = CopyParse(tail->parent, root, newroot);
	if (!parent)
		return NULL;
	newtail = AddChild(parent, tail->subst, ParseAdvanceString(NULL, parent));
	NodeRelease(parent);
	return newtail;
}


/*
 * Replace oldnode with a dynamic substitution for newchar, if possible,
 * and fill in the chain down to "tail" just like before, but with no branches.
 * Add the resultant ParseNode to the heap.
 */
static void
AddDynamic(Heap *heap, ParseNode const *oldnode, ParseNode const *tail,
	int newchar)
{
	Substitution const *subst = oldnode->subst;
	ParseNode *newnode;

	/* Only replace one-character substitutions */
	if (subst->outlen != 1)
		return;

	subst = SubstDynamic(oldnode->subst->input, SubstString(newchar), 0);
	newnode = AddChild(oldnode->parent, subst, -1);	/* Try it immediately */
	if (newnode) {
		newnode->ps.flags |= PS_FLAG_DYNAMIC;
		newnode = CopyParse(tail, oldnode, newnode);
		if (newnode)
			HeapInsert(heap, &newnode->cost);
	}


--4a7e 000f98eb62040010001 Page 16 of repair.c


}

/*
 * Do the same, at a given (1-based) position on the line.  Owing to
 * a minor glitch, we must never count the tail node, as this has not
 * been parsed yet, so its oldnode->ps.pos field is inaccurate.
 */
static void
AddDynamicAt(Heap *heap, int position, ParseNode const *tail, int newchar)
{
	ParseNode const *oldnode = tail;

	do {
		oldnode = oldnode->parent;
	} while (oldnode->ps.pos > position);

	if (oldnode->ps.pos == position)
		AddDynamic(heap, oldnode, tail, newchar);
}

/*
 * Given the computed and input check fields, correct the header field
 * that *ends* at the given pos.  This can be used for both the line and
 * page CRC errors by just changing the pos.  (It relies on the fact
 * that the page CRC fragment fits into the LineCRC type.)
 * It also relies on the fact that the CRC is at most 4 digits.
 */
static void
ErrorCorrectHeader(Heap *heap, ParseNode const *tail, int pos,
	EncodeFormat const *format, CRC crc, CRC check)
{
	CRC syndrome = crc ^ check;

	/* Find the position and the crc digit at that position */
	while (syndrome >= (CRC)format->radix) {
		if (syndrome & (CRC)(format->radix - 1))
			return;	/* uncorrectable */
		pos--;
		crc >>= format->bitsPerDigit;
		syndrome >>= format->bitsPerDigit;
	}
	/* Paste in the correct digit */
	AddDynamicAt(heap, pos, tail, EncodeDigit(format, crc & (format->radix-1)));
}
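
/*
 * Sketch, not part of the original listing: the digit search used by
 * ErrorCorrectHeader, isolated from the Heap and ParseNode machinery.
 * It assumes the check field is a plain integer made of digits that are
 * bitsPerDigit bits wide.  SketchLocateDigit and its parameters are
 * hypothetical names introduced only for this illustration.
 */

```c
#include <stdint.h>

/*
 * Compare a computed check value against the scanned one.  If exactly one
 * digit differs, store the correct digit value in *digit and return how
 * many digits it lies from the right-hand end (0 = last digit).
 * Return -1 if the values match or if more than one digit differs.
 */
static int
SketchLocateDigit(uint32_t computed, uint32_t scanned, unsigned bitsPerDigit,
	unsigned *digit)
{
	uint32_t mask = (1u << bitsPerDigit) - 1;
	uint32_t syndrome = computed ^ scanned;
	int fromRight = 0;

	if (syndrome == 0)
		return -1;	/* Nothing to correct */
	while (syndrome > mask) {
		if (syndrome & mask)
			return -1;	/* Two or more digits differ */
		syndrome >>= bitsPerDigit;
		computed >>= bitsPerDigit;
		fromRight++;
	}
	*digit = computed & mask;
	return fromRight;
}
```

/*
 * With hex digits (bitsPerDigit = 4), a computed value of 0x4a7e against
 * a scanned 0x4a1e differs only in the second digit from the right, so
 * that is the one digit a caller would try replacing.
 */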


/*
 * This function walks back through the line, and if the line CRC could be
 * made correct by changing a character to another legal character,
 * the change is added (on probation) to the substitution table.
 */
static void
ErrorCorrect(Heap *heap, OutputHandle oh, EncodeFormat const *format,
	CRC syndrome)
{
	ParseNode const *tail = oh.node;
	int c;

	syndrome = ReverseCRC(format->lineCRC, syndrome, 0);

	while (oh.node->ps.pos > PREFIX_LENGTH) {
		c = OutputGetPrev(&oh);
		if (c == '\n' || c == -1) {	/* Can't happen */
			printf("Line ended at pos %d\n", oh.node->ps.pos);
			return;
		}
		syndrome = ReverseCRC(format->lineCRC, syndrome, 0);
		if (syndrome >= 0x100 || !substitutions[0][c^syndrome] ||
		    oh.node->subst->outlen != 1)
			continue;
		AddDynamic(heap, oh.node, tail, c^syndrome);
	}
}
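
/*
 * Sketch, not part of the original listing: how a CRC syndrome can be
 * walked backwards to propose single-character fixes, in the spirit of
 * ErrorCorrect.  It assumes a reflected CRC-16 with polynomial 0xA001,
 * a zero initial value and no final XOR; the real lineCRC parameters are
 * not shown in this section.  Every Sketch* name is hypothetical.
 * Because such a CRC is linear, the syndrome of a one-byte error equals
 * the error byte pushed through one CRC step per following byte, so
 * undoing those steps exposes the error byte as a value below 0x100.
 */

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_POLY 0xA001u	/* Reflected CRC-16 polynomial (assumed) */

typedef struct SketchFix {
	size_t pos;		/* Index of the byte to change */
	unsigned char ch;	/* Value that would make the CRC match */
} SketchFix;

/* Feed one byte into the CRC */
static uint16_t
SketchCrcByte(uint16_t crc, unsigned char b)
{
	int i;

	crc ^= b;
	for (i = 0; i < 8; i++)
		crc = (crc & 1) ? (uint16_t)((crc >> 1) ^ SKETCH_POLY)
		                : (uint16_t)(crc >> 1);
	return crc;
}

static uint16_t
SketchCrc(unsigned char const *p, size_t n)
{
	uint16_t crc = 0;

	while (n--)
		crc = SketchCrcByte(crc, *p++);
	return crc;
}

/* Undo the effect of feeding one zero byte */
static uint16_t
SketchCrcReverse(uint16_t crc)
{
	int i;

	for (i = 0; i < 8; i++)
		crc = (crc & 0x8000u) ? (uint16_t)(((crc ^ SKETCH_POLY) << 1) | 1)
		                      : (uint16_t)(crc << 1);
	return crc;
}

/*
 * List every single-byte change that would make the CRC of buf equal
 * expected.  Returns the number of candidates stored in fixes[].
 */
static size_t
SketchFindFixes(unsigned char const *buf, size_t n, uint16_t expected,
	SketchFix *fixes, size_t max)
{
	uint16_t syndrome = SketchCrc(buf, n) ^ expected;
	size_t i = n, count = 0;

	if (!syndrome)
		return 0;	/* CRC already matches */
	while (i-- > 0 && count < max) {
		syndrome = SketchCrcReverse(syndrome);
		if (syndrome < 0x100) {
			fixes[count].pos = i;
			fixes[count].ch = (unsigned char)(buf[i] ^ syndrome);
			count++;
		}
	}
	return count;
}
```

/*
 * More than one candidate can come back, since any of them makes the CRC
 * match.  That is presumably why the listing above only adds its
 * candidates "on probation" and lets the parser judge them.
 */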

/*
 * Parsing operations.  This is a rather ugly and ad-hoc parser that
 * knows a lot about the fixed-field format produced by the munge
 * utility.  The main state variable is the position in
 * the line, which controls the expected header, the position of
 * tab stops, and the maximum permissible line length.
 */
#define OCCASIONALLY 100

/* Set up a ParseState to top-of-page */
static void
ParseStateInit(ParseState *ps)
{
	static struct ParseState const parseNull = { 0, 0, 0 };
	*ps = parseNull;
}


/*
 * Try to accept a newline, checking CRCs and even doing error-correction
 * as appropriate.
 */
static int
ParseNewline(Heap *heap, ParseNode *pn, char const *string)
{
	OutputHandle oh;
	int c;
	char debugbuf[PREFIX_LENGTH+LINE_LENGTH+10];
	char *header, *body, *end;
	int pos, width;
	CRC crc, check;
	ParseNode *temp;
	static int occasionally = OCCASIONALLY;
	EncodeFormat const *format = pnFormat(pn);
	EncodeFormat const *headerFormat = &hexFormat;

	/* Get the line into a buffer for analysis */
	OutputInit(&oh, pn, string);
	end = debugbuf + sizeof(debugbuf)-1;
	header = OutputGetUntil(oh, end, '\n');
	/* Strip leading and trailing whitespace */
	while (header < end && isspace((unsigned char)header[0]))
		header++;
	while (header < end && isspace((unsigned char)end[-1]))
		end--;
	*end++ = '\n';

	/* Start of checksummed area */
	body = header + PREFIX_LENGTH;
	/* Blank lines are missing the trailing space from the prefix */
	if (body >= end)
		body = end-1;

	crc = CalculateCRC(format->lineCRC, 0, body, end-body);
	check = GetWord32(format, header+2, 4);
	if (crc != check) {
		if (!--occasionally) {
			OverstrikeLine(header, end-header-1);
			occasionally = OCCASIONALLY;
		}
		/* Try ECC on the line */
		/* If we haven't already tried ECC on the line... */
		if (!(pn->ps.flags & PS_FLAG_DYNAMIC)) {
			ErrorCorrectHeader(heap, pn, PREFIX_LENGTH-1, format, crc, check);
			ErrorCorrect(heap, oh, format, crc ^ check);
		}
		return COST_INFINITY;
	}
	/* Good enough that we always print it */
	OverstrikeLine(header, end-header-1);

	/* Okay, now there are two cases - header line or running CRC */
	if (pn->ps.flags & PS_FLAG_INHEADER) {
		/* Do things for first header */
		if (!(pn->ps.flags & PS_FLAG_PASTHEADER)) {
			/* Check version number */
			width = EncodedLength(headerFormat, HDR_VERSION_BITS);
			c = (int)GetWord32(&hexFormat, body, width);
			if (c != 0) {
				fputs("Fatal: you need a newer version of repair"
				      " to process this file\n", stderr);
				exit(1);
			}
			/* Suck in page CRC, after version & flags */
			pos = width + EncodedLength(headerFormat, HDR_FLAG_BITS);
			width = EncodedLength(headerFormat, format->pageCRC->bits);
			perpage.page_check = GetWord32(&hexFormat, body+pos, width);
			/* Get tab size */
			pos += width;
			width = EncodedLength(headerFormat, HDR_TABWIDTH_BITS);
			perpage.tabsize = GetWord32(&hexFormat, body+pos, width);

			/* Once we have the header, don't reconsider */
			if (!(pn->ps.flags & PS_FLAG_PASTHEADER))
				while ((temp = (ParseNode *)HeapGetMin(heap)) != NULL)
					NodeRelease(temp);
			pn->ps.page_crc = 0;	/* Clear for top of page */
		}
	} else {
		/* Check the CRC-32 */
		crc = CalculateCRC(format->pageCRC, pn->ps.page_crc, body, end-body);
		pn->ps.page_crc = crc;
		crc = RunningCRCFromPageCRC(format, crc);
		check = GetWord32(format, header, 2);
		if (crc != check) {
			if (!(pn->ps.flags & PS_FLAG_DYNAMIC))
				ErrorCorrectHeader(heap, pn, 2, format, crc, check);
			return COST_INFINITY;
		}
	}

	/* Hey, it's correct! */
	PrintLine(header, end-header-1);

	/* Start next line */
	pn->ps.pos = 0;
	/* Clear most other flags, but we *have* got a header */
	c = pn->ps.flags & PS_FLAG_DYNAMIC;
	pn->ps.flags &= PS_FLAG_BINEND | PS_MASK_FORMAT;
	pn->ps.flags |= PS_FLAG_PASTHEADER;
	/*
	 * Give a bonus to the next line for having completed this one,
	 * less if it was dynamically fixed.
	 */
	return c ? COST_LINE : COST_LINE*2/3;
}

14af5a 

/*
 * Advance the parse state with pointed-to character.  Returns
 * COST_INFINITY if an impossible state is reached, otherwise returns a
 * cost value.  (Normally 0, this can be increased to penalize unlikely
 * output combinations to nudge the correction in a certain direction.)
 */
static HeapCost
ParseAdvance(Heap *heap, ParseNode *pn, char const *string)
{
	int i, retval = 0;
	char c = *string;
	EncodeFormat const *format = pnFormat(pn);

	/*
	 * Insist on spaces being correctly converted to SPACE_CHAR.
	 * There's a little irregularity just before EOL.
	 * Line continuation and formfeed are also only legal at EOL.
	 */
	if (c == ' ') {
		if (pn->ps.flags & PS_FLAG_SPACE && !(pn->ps.flags & PS_FLAG_TAB))
			pn->ps.flags |= PS_FLAG_EOL;
		pn->ps.flags |= PS_FLAG_SPACE;
	} else if (pn->ps.flags & PS_FLAG_EOL) {
		if (c != '\n')
			return COST_INFINITY;
	} else if (c == SPACE_CHAR) {
		if (!(pn->ps.flags & PS_FLAG_SPACE))
			pn->ps.flags |= PS_FLAG_EOL;
	} else if (c == CONTIN_CHAR || c == FORMFEED_CHAR) {
			pn->ps.flags |= PS_FLAG_EOL;
	} else {
		pn->ps.flags &= ~PS_FLAG_SPACE;
	}

	switch (pn->ps.pos) {
		case 0:
			if (c == ' ' || c == '\n') {
				break;	/* Ignore ws and blank lines completely */
			} else if (c == '\f' || c == HDR_PREFIX_CHAR) {
				/* Start of a new page */
				pn->ps.flags |= PS_FLAG_INHEADER;	/* Expect header next */
				if (c == '\f')
					break;
				/* And fall through to increment pos */
			} else if (pn->ps.flags & PS_FLAG_INHEADER ||
			           pn->ps.flags & PS_FLAG_BINEND ||
			           !(pn->ps.flags & PS_FLAG_PASTHEADER) ||
			           DecodeDigit(format, c) < 0) {
				return COST_INFINITY;	/* Various illegal cases */
			}
			pn->ps.pos++;
			break;
		case 1:
			if ((pn->ps.flags & PS_FLAG_INHEADER)) {
				format = FindFormat(c);	/* Second char of header */
				if (!format)
					return COST_INFINITY;
				i = registerFormat(format);
				psSetFormat(&pn->ps, i);
				pn->ps.pos++;
				break;
			}


			if (DecodeDigit(format, c) < 0)
				return COST_INFINITY;	/* Illegal */
			pn->ps.pos++;
			break;
		case 2:
		case 3:
		case 4:
#if PREFIX_LENGTH != 7
#error fix this code
#endif
		case PREFIX_LENGTH-2:
			if (DecodeDigit(format, c) < 0)
				return COST_INFINITY;	/* Illegal */
			pn->ps.pos++;
			break;
		case PREFIX_LENGTH-1:
			if (c == ' ') {
				pn->ps.pos++;
				break;
			} else if (c != '\n') {
				return COST_INFINITY;
			}
			/* Blank lines may be missing this space char */
			/*FALLTHROUGH*/
		/* The normal line starts here, at position 7 */
		default:
			if (pn->ps.flags & PS_FLAG_INHEADER) {	/* Header line */
				/* Format is "--abcd 0123456789abcdef012 Page %u of %s" */
				int off = pn->ps.pos - (PREFIX_LENGTH+HDR_ENC_LENGTH);
				/* Offset relative to end of hex header */
				if (off < 0) {
					if (HexDigitValue(c & 255) < 0)
						return COST_INFINITY;
				} else if (off < 6) {
					if (c != " Page "[off])	/* Yes, this is legal C */
						return COST_INFINITY;
				} else if (off == 6) {
					if (c < '1' || c > '9')	/* First digit of page no. */
						return COST_INFINITY;
				} else {
					/* Re-base to end of scanned part of page number */
					off -= 7 + PageNumDigits(pn);
					if (off == 0) {
						if (c >= '0' && c <= '9' && PageNumDigits(pn) < 3)
							PageNumDigitsIncrement(pn);
						else if (c != ' ')
							return COST_INFINITY;
					} else if (off < 4) {
						if (c != " of "[off])
							return COST_INFINITY;
					} else if (off == 4) {
						if (!isgraph(c))
							return COST_INFINITY;
					} else if (c < ' ' || (c & 255) > '~') {
						if (c != '\n')
							return COST_INFINITY;
						return ParseNewline(heap, pn, string);
					}
				}
			} else if (!perpage.tabsize) {	/* Radix-64 line */
				/* Line is "RINFVF9UQU== \n" */
				if (isspace(c & 255)) {
					if (!(pn->ps.flags & PS_FLAG_BINWS)) {
						if ((pn->ps.pos - PREFIX_LENGTH) % 4 != 0)
							return COST_INFINITY;
						pn->ps.flags |= PS_FLAG_BINWS;
						if (pn->ps.pos - PREFIX_LENGTH < BYTES_PER_LINE*4/3)
							pn->ps.flags |= PS_FLAG_BINEND;
					}
					if (c == '\n')
						return ParseNewline(heap, pn, string);
				} else if (pn->ps.flags & PS_FLAG_BINWS) {
					return COST_INFINITY;
				} else if (c == RADIX64_END_CHAR) {
					if ((pn->ps.pos - PREFIX_LENGTH) % 4 < 2)
						return COST_INFINITY;
					pn->ps.flags |= PS_FLAG_BINEND;
				} else if (pn->ps.flags & PS_FLAG_BINEND) {
					return COST_INFINITY;
				} else if (Radix64DigitValue(c) < 0) {
					return COST_INFINITY;
				}
			} else {	/* Normal line */
				/* Make sure tab stops come out right */
				if (pn->ps.flags & PS_FLAG_TAB) {
					if (((pn->ps.pos - PREFIX_LENGTH) % perpage.tabsize) == 0)
						pn->ps.flags &= ~PS_FLAG_TAB;
					else if (c != TAB_PAD_CHAR && c != '\n') {
						return COST_INFINITY;	/* Illegal */
					}
				}
				/*
				 * Yes, this code has hard-coded ASCII assumptions
				 * It knows that the acceptable range of '\n', ' '..'~',
				 * TAB_CHAR, FORMFEED_CHAR is in that order.
				 * Signed char machines have it backwards, to be confusing.
				 */
				if ((c & 255) < ' ') {
					/* Newline! (Or something illegal) */
					if (c != '\n')
						return COST_INFINITY;
					return ParseNewline(heap, pn, string);
				}
				/* A normal character */
				if ((c & 255) > '~') {
					if (pn->ps.flags & PS_FLAG_INHEADER)
						return COST_INFINITY;	/* Illegal */
					if (c == TAB_CHAR)
						pn->ps.flags |= PS_FLAG_TAB;
					else if (c != FORMFEED_CHAR && c != SPACE_CHAR &&
					         c != CONTIN_CHAR)
						return COST_INFINITY;	/* Illegal */
				}
			}
			if (++pn->ps.pos > PREFIX_LENGTH + LINE_LENGTH)
				return COST_INFINITY;
			break;
	}
	return retval;
}
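
/*
 * Sketch, not part of the original listing: ParseAdvance checks the page
 * header one character at a time against the documented shape
 * "--abcd 0123456789abcdef012 Page %u of %s".  For comparison, the same
 * shape can be split in one pass with sscanf.  The SketchHeader type, its
 * field names and its buffer sizes are assumptions made for this
 * illustration; the hex fields after the line check are left undecoded
 * because their widths are defined outside this section.
 */

```c
#include <stdio.h>

typedef struct SketchHeader {
	unsigned lineCheck;	/* Four hex digits following "--" */
	char fields[32];	/* Remaining hex header fields, undecoded */
	unsigned page;		/* Page number */
	char name[64];		/* File name */
} SketchHeader;

/* Returns 1 if line looks like a page header, 0 otherwise */
static int
SketchParseHeader(char const *line, SketchHeader *h)
{
	int n = sscanf(line, "--%4x %31[0-9a-fA-F] Page %u of %63s",
	               &h->lineCheck, h->fields, &h->page, h->name);

	return n == 4;
}
```

/*
 * Unlike ParseAdvance, this accepts or rejects the whole line at once, so
 * it cannot tell a correction search which character went wrong.
 */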

87af5a 

/*
 * Run the parser over the string in a ParseNode (using repeated calls
 * to ParseAdvance).  Return the penalty cost, or COST_INFINITY if
 * it's impossible
 */
static HeapCost
ParseAdvanceString(Heap *heap, ParseNode *pn)
{
	HeapCost cost, total = 0;
	char const *string = pn->subst->output;

	while (*string) {
		cost = ParseAdvance(heap, pn, string++);
		if (cost == COST_INFINITY)
			return cost;
		total += cost;
	}
	return total;
}

static unsigned int *globalStats = NULL;
static unsigned globalSize = 0;
static unsigned globalEdits = 0;


/*
 * This walks the list of substitutions, performing two tasks with
 * the statistics gathered.
 *
 * First, although not essential, it prints any interesting changes
 * (non-identity substitutions) made, and a count of the total number
 * of substitutions (including identity) as an approximate character count.
 *
 * Second, it does maintenance on dynamic (learned during program
 * execution) substitutions.  It discards any substitutions that end
 * up unused, and computes nice costs for the others, based on the
 * global (per-file) statistics.
 *
 * (This function is also called at the end to print the per-file stats,
 * which does redundant weight adjustment, but it's harmless.)
 */
static void
UseStats(unsigned int *stats, FILE *log)
{
	unsigned int i, j, n, changes = 0;
	unsigned long grand = 0;
	Substitution *s, **sp;

	if (!stats)
		return;

	/* Yes, this loop is permuted on purpose */
	for (j = 0; j < elemsof(*substitutions); j++) {
		for (i = 0; i < elemsof(substitutions); i++) {
			sp = &substitutions[i][j];
			while ((s = *sp) != 0) {
				grand += n = stats[s->index];
				/* Retain or purge dynamic substitutions, depending. */
				if (SubstIsDynamic(s)) {
					if (n) {
						SubstAdjust(s, n);
					} else if (!globalStats[s->index]) {
						/* Forget unused dynamic substitutions */
						*sp = s->next;
						if (SubstIsNasty(s))
							free((char *)s->input);	/* Dynamically allocated */
						SubstFree(s);
						continue;
					}
				}
				sp = &s->next;
				/*
				 * Print interesting substitutions.  Some boring substitutions,
				 * flagged with an index value of zero, are not printed.
				 */
				if (!s->index || !n)
					continue;
				changes += n;
				fprintf(log, "\t%2ux \"%.*s\"%*s-> \"%.*s\"%*s(cost ",
					stats[s->index], (int)s->inlen, s->input,
					s->inlen>3 ? 0 : 3-(int)s->inlen, "",
					(int)s->outlen, s->output,
					s->outlen>3 ? 0 : 3-(int)s->outlen, "");
				fprintf(log, s->cost == COST_INFINITY ? "-" : "%d", s->cost);
				if (s->filter)
					fprintf(log, s->cost2 == COST_INFINITY ? "/-" : "/%d",
						s->cost2);
				fputs(SubstIsDynamic(s) ? ") ** LEARNED **\n" : ")\n", log);
			}
		}
	}
	fprintf(log, "\tTotal: %u changes (out of %lu)\n", changes, grand);
}
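
/*
 * Sketch, not part of the original listing: the pointer-to-pointer walk
 * that UseStats relies on (sp = &list head, then sp = &s->next) to unlink
 * unused entries without tracking a "previous" node.  SketchNode and both
 * functions are hypothetical stand-ins for the Substitution list.
 */

```c
#include <stdlib.h>

typedef struct SketchNode {
	struct SketchNode *next;
	unsigned uses;		/* How often this entry was used */
} SketchNode;

/* Prepend a node; returns 1 on success, 0 if out of memory */
static int
SketchPush(SketchNode **head, unsigned uses)
{
	SketchNode *s = malloc(sizeof(*s));

	if (!s)
		return 0;
	s->uses = uses;
	s->next = *head;
	*head = s;
	return 1;
}

/* Unlink and free every node with zero uses; returns how many went */
static unsigned
SketchPurgeUnused(SketchNode **head)
{
	SketchNode **sp = head, *s;
	unsigned removed = 0;

	while ((s = *sp) != NULL) {
		if (s->uses == 0) {
			*sp = s->next;	/* Works for the head and for inner nodes */
			free(s);
			removed++;
			continue;	/* sp already points at the next candidate */
		}
		sp = &s->next;
	}
	return removed;
}
```

/*
 * Because sp always addresses the link that leads to the current node,
 * removing the first node needs no special case.
 */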

static void
DoStats(ParseNode const *node, unsigned int page, FILE *log)
{
	unsigned int *stats;
	unsigned int n;

	/* Enlarge global stats if needed */
	if (globalSize < substCount) {
		stats = realloc(globalStats, substCount * sizeof(*stats));
		if (!stats) {
			fputs("Fatal error: out of memory for stats!\n", stderr);
			exit(1);
		}
		for (n = globalSize; n < substCount; n++)
			stats[n] = 0;
		globalStats = stats;
		globalSize = substCount;
	}

	/* Allocate per-page stats */
	stats = calloc(substCount, sizeof(*stats));
	if (!stats) {
		fputs("Fatal error: out of memory for stats!\n", stderr);
		exit(1);
	}
	/* Cheat and assume that calloc() initializes unsigned ints to zero */
	while (node) {
		stats[node->subst->index]++;
		node = node->parent;
	}

	/* Keep the global counts accurate */
	for (n = 0; n < substCount; n++)
		globalStats[n] += stats[n];

	fprintf(log, "Page %u substitutions:\n", page);
	UseStats(stats, log);

	free(stats);
}

/* Spit out a page of data (needs work).  Returns number of lines */
static unsigned
PrintPage(OutputHandle oh, FILE *out)
{
	char pagebuf[PAGE_BUFFER_SIZE];
	char *p1;	/* Beginning of current line */
	char *p2;	/* End of current line (WS stripped) */
	char *p3;	/* End of current line (newline) */
	char *p4;	/* End of all output */
	unsigned lines = 0;

	p4 = pagebuf + sizeof(pagebuf);
	p1 = OutputGetUntil(oh, p4, -1);

	/* Output the lines without leading & trailing whitespace */
	while (p1 < p4) {
		/* Identify the line */
		p3 = memchr(p1, '\n', p4-p1);
		if (!p3)
			p3 = p4;
		/* Delete leading whitespace */
		while (isspace((unsigned char)*p1) && p1 < p3)
			p1++;
		/* Delete trailing whitespace */
		p2 = p3;
		while (isspace((unsigned char)p2[-1]) && p1 < p2)
			p2--;
		/* Spit out this line */
		fwrite(p1, 1, (size_t)(p2-p1), out);
		putc('\n', out);
		/* Advance p1 past the newline */
		p1 = p3 + 1;
		lines++;
	}
	return lines;
}
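
/*
 * Sketch, not part of the original listing: the per-line trimming loop of
 * PrintPage, written against plain memory buffers so it can be tried
 * without an OutputHandle or a FILE.  SketchTrimLines is a hypothetical
 * name; it assumes the caller supplies an output buffer of at least
 * len + 1 bytes.  It also checks the bounds before dereferencing and
 * never steps past the end of the input, which the loop above does not
 * guarantee.
 */

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy in[0..len) to out line by line, dropping leading and trailing
 * whitespace and ending every line with '\n'.  Stores the number of bytes
 * written in *outlen and returns the number of lines.
 */
static unsigned
SketchTrimLines(char const *in, size_t len, char *out, size_t *outlen)
{
	char const *p1 = in;		/* Beginning of current line */
	char const *p2;			/* End of current line (WS stripped) */
	char const *p3;			/* End of current line (newline) */
	char const *p4 = in + len;	/* End of all input */
	unsigned lines = 0;
	size_t n = 0;

	while (p1 < p4) {
		p3 = memchr(p1, '\n', (size_t)(p4 - p1));
		if (!p3)
			p3 = p4;
		while (p1 < p3 && isspace((unsigned char)*p1))
			p1++;
		p2 = p3;
		while (p1 < p2 && isspace((unsigned char)p2[-1]))
			p2--;
		memcpy(out + n, p1, (size_t)(p2 - p1));
		n += (size_t)(p2 - p1);
		out[n++] = '\n';
		p1 = (p3 < p4) ? p3 + 1 : p4;
		lines++;
	}
	*outlen = n;
	return lines;
}
```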

static volatile int interrupt = 0;
static void (* volatile oldhandler)(int) = SIG_DFL;

static void inthandler(int sig)
{
	if (++interrupt > 2)
		(void)signal(sig, oldhandler);
}

/*
 * Given a buffer, process a page from it and try to write a corrected page to
 * the out file.  Return the number of bytes accessed.  (0 if it was unable
 * to make any corrections.)
 */
static size_t
DoPage(char const *buf, size_t len, FILE *out, unsigned int page, FILE *log)
{
	ParseNode *node;
	Heap heap;
	HeapCost cost;
	OutputHandle oh;
	void (*sighandler)(int);

	HeapInit(&heap, 1000);
	NodePoolInit();
	PerPageInit(buf);

	/* Initialize signal handling */
	interrupt = 0;
	sighandler = signal(SIGINT, inthandler);
	if (sighandler != inthandler)
		oldhandler = sighandler;

	/* Make a root node */
	node = NodeAlloc();
	node->cost = 0;
	node->refcnt = 1;
	node->input = buf;
	node->subst = &substNull;
	ParseStateInit(&node->ps);
	node->parent = NULL;

	HeapInsert(&heap, &node->cost);

	/* The main loop: try to extend the current parse. */
	while ((node = (ParseNode *)HeapGetMin(&heap)) != NULL) {
		cost = ParseAdvanceString(&heap, node);
		if (cost != COST_INFINITY) {
			/* End of the file, or hit a second header line? */
			if (node->input == buf+len || InSecondHeader(&node->ps)) {
				/* Try to wrap up page, if page CRC works */
				if (node->ps.page_crc == perpage.page_check) {
					/* Success! */
					HeapDestroy(&heap);
					OutputInit(&oh, node, NULL);
					OverstrikeLine("", 0);

					if (InSecondHeader(&node->ps)) {
						/* Back up to last newline */
						OutputInit(&oh, node, NULL);
						while (OutputGetPrev(&oh) != '\n')
							;
						OutputUnget(&oh);
					}
					/* oh points to node that emitted last char on page */
					len = oh.node->input - buf;	/* Chars eaten this page */
					perpage.lines = PrintPage(oh, out);
					DoStats(oh.node, page, log);
					NodePoolCleanup();
					return len;
				}
			} else {
				/* Keep working on the page */
				node->cost = cost += node->cost;
				if (node->input > perpage.maxpos) {
					perpage.maxpos = perpage.minpos = node->input;
					if (perpage.max_retries < perpage.retries)
						perpage.max_retries = perpage.retries;
					perpage.retries = 0;	/* Made progress */
				} else if (node->input < perpage.minpos) {
					perpage.minpos = node->input;	/* Furthest backtrack */
				}
				++perpage.retries;
				if (heap.numElems > MAX_HEAP || interrupt)
					HeapDestroy(&heap);
				else
					AddChildren(node, &heap, buf+len);
			}
		}
		NodeRelease(node);
	}
	/* Failed! */
	OverstrikeLine(NULL, 0);
	puts("Stopping for manual edit.");

	NodePoolCleanup();
	/* Get rid of the dynamic substitutions */
	DoStats(NULL, page, log);
	return 0;
}


/* The magic file-shuffling routine. */
static int
RepairFile(char const *name, char const *editor, char const *nastylines)
{
	char buf[PAGE_BUFFER_SIZE];
	char *filename;
	char const *p;
	size_t namelen;
	FILE *in = 0, *out = 0, *dump = 0, *log = 0;
	size_t inbytes;		/* Bytes in input buffer */
	size_t outbytes;	/* Bytes taken from input buffer */
	unsigned int pages = 0;	/* # of pages processed */
	unsigned int lines = 0;	/* # of lines processed (until trouble) */
	unsigned int minline, maxline;	/* Where is the error? */
	int giveup;		/* Have we had to abort corrections? */
	int err;		/* Copy of errno for returns */

	globalSize = 0;	/* Reset global stats */

	namelen = strlen(name);
	if (!(filename = malloc(namelen+10))) {
		p = "Unable to allocate memory\n";
		goto error;
	}

	memcpy(filename, name, namelen);
	strcpy(filename+namelen, ".log");
	puts(filename);
	if (!(log = fopen(filename, "at"))) {
		p = "Unable to open log file \"%s\"\n";
		goto error;
	}
	strcpy(filename+namelen, ".out");
	puts(filename);
	if (!(out = fopen(filename, "at"))) {
		p = "Unable to open output file \"%s\"\n";
		goto error;
	}

retry:
	/* Read in any new nasty lines */
	if (!(in = fopen(nastylines, "rt"))) {
		fprintf(stderr, "Unable to open nasty lines file \"%s\"\n", nastylines);
	} else {
		ReadNasties(in);
		fclose(in);
	}
	/* Try to open input file - .in or original */
	p = filename;
	strcpy(filename+namelen, ".in");
	if (!(in = fopen(filename, "rt"))) {
		if (!(in = fopen(name, "rt"))) {
			filename[namelen] = 0;
			p = "Unable to open input file \"%s\"\n";
			goto error;
		}
		p = name;
	}
	printf("Repairing from %s\n", p);
	strcpy(filename+namelen, ".dmp");
	if (!(dump = fopen(filename, "wt"))) {
		p = "Unable to open output file \"%s\"\n";
		goto error;
	}

	giveup = 0;
	inbytes = 0;	/* Bytes already at the front of the buffer */
	/* Append more data from the file */
	while ((inbytes += fread(buf+inbytes, 1, sizeof(buf)-inbytes, in)) != 0) {
		if (giveup) {
			/* Giving up mode - just copy through */
			outbytes = fwrite(buf, 1, inbytes, dump);
			if (!outbytes) {
				p = "Error writing dump file!\n";
				goto error;
			}
		} else {
			outbytes = DoPage(buf, inbytes, out, pages+1, log);
			NodePoolCleanup();
			if (outbytes) {
				pages++;
				lines += perpage.lines;
			} else {	/* Failed */
				/* Find range of backtracking for error location */
				minline = 1;
				for (p = buf; p < perpage.minpos; p++)
					minline += (*p == '\n');
				for (maxline = minline; p < perpage.maxpos; p++)
					maxline += (*p == '\n');
				giveup = 1;
			}
		}
		/* Fewer bytes now in the buffer */
		inbytes -= outbytes;
		/* Move those bytes to the front again */
		memmove(buf, buf+outbytes, inbytes);
	}

	fclose(in);
	in = 0;
	fclose(dump);
	dump = 0;

	/* Okay, let's get tricky */
	memcpy(buf, name, namelen);
	strcpy(buf+namelen, ".dmp");
	strcpy(filename+namelen, ".in");

	/* teun: MS Visual C doesn't rename on top of existing file; remove it */
	if (remove(filename) != 0) {
		err = errno;
		fprintf(stderr, "Warning deleting %s\n", filename);
	}

	if (rename(buf, filename) != 0) {
		err = errno;
		fclose(out);
		fclose(log);
		/* teun: corrected buf, filename order. This cost me an hour */
		fprintf(stderr, "Error renaming %s -> %s\n", buf, filename);
		return err;
	}

	/* This code is spaghetti - is there a cleaner way? */
	if (giveup) {
		printf("Error in %s, lines %u-%u\n", filename, minline, maxline);
		fprintf(log, "Error in %s, lines %u-%u\n", filename, minline, maxline);
		if (interrupt > 1)
			goto manual;
		if (editor) {
			if (strcmp(editor, "-") == 0)
				goto manual;
			sprintf(buf, editor, maxline, filename);
		} else {
			p = getenv("VISUAL");
			if (!p)
				p = getenv("EDITOR");
			if (!p)
				goto manual;
			sprintf(buf, "%s +%u %s\n", p, maxline, filename);
		}
		printf("Executing %s\n", buf);
		globalEdits++;
		if (system(buf) == 0)
			goto retry;
		fputs("Edit failed - aborting\n", stderr);
manual:
		puts("Please fix the error by hand and re-run repair.");
	}

	fclose(out);
	free(filename);

	fprintf(log, "\n%u lines successfully processed.\n", lines);
	fprintf(log, "Overall substitutions (%u pages):\n", pages);
	UseStats(globalStats, log);
	printf("%u manual edits required\n", globalEdits);
	fclose(log);

	return 0;

error:
	err = errno;
	if (log) fclose(log);
	if (dump) fclose(dump);
	if (out) fclose(out);
	if (in) fclose(in);
	fprintf(stderr, p, filename);
	free(filename);
	return err;
}


/* Process the command line, calling RepairFile as needed. */
int
main(int argc, char *argv[])
{
    int     result = 0;
    int     i;
    char const *editor = NULL;
    char const *nastylines = "nastylines";

    InitUtil();
    substBuild();
    memPoolInit(&nastyStructs);
    memPoolInit(&nastyStrings);

    /* Process leading flags */
    for (i = 1; i < argc && argv[i][0] == '-'; i++) {
        if (argv[i][1] == '-' && argv[i][2] == 0) {
            i++;
            break;
        } else if (argv[i][1] == 'e') {
            editor = argv[i][2] ? argv[i]+2 : argv[++i];
        } else if (argv[i][1] == '1') {
            nastylines = argv[i][2] ? argv[i]+2 : argv[++i];
        } else {
            fprintf(stderr, "ERROR: Unrecognized option %s\n", argv[i]);
            return 1;
        }
    }

    /* Process files */
    for (; i < argc; i++) {
        result = RepairFile(argv[i], editor, nastylines);
        if (result != 0) {
            fprintf(stderr, "Fatal error: %s\n", strerror(result));
            return 1;
        }
    }

    return 0;
}
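
The editor fallback in RepairFile above (an explicit editor, then $VISUAL, then $EDITOR, then manual repair) can be sketched in isolation. This is only an illustration under stated assumptions: `build_edit_command` is a hypothetical helper that is not part of repair.c, it treats `editor` as a plain program name rather than the printf-style template that the `-e` option supplies, and it uses `snprintf` so that an over-long command is rejected instead of overflowing the buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not in repair.c): build "<editor> +<line> <file>".
 * If editor is NULL, fall back to $VISUAL and then $EDITOR.
 * Returns 0 on success, -1 if no editor is known or buf is too small.
 */
static int
build_edit_command(char *buf, size_t buflen, char const *editor,
                   unsigned line, char const *filename)
{
    int n;

    assert(buf != NULL && filename != NULL);
    if (editor == NULL)
        editor = getenv("VISUAL");
    if (editor == NULL)
        editor = getenv("EDITOR");
    if (editor == NULL)
        return -1;      /* Caller falls back to manual editing */

    n = snprintf(buf, buflen, "%s +%u %s", editor, line, filename);
    if (n < 0 || (size_t)n >= buflen)
        return -1;      /* Truncated: refuse to run a partial command */
    return 0;
}
```

A caller would pass the resulting string to `system()` only when the helper returns 0, and otherwise print the "fix the error by hand" message.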


/*
 * Local Variables:
 * tab-width: 4
 * End:
 * vi: ts=4 sw=4
 * vim: si
 */

/*

 * util.c -- Miscellaneous shared code/data
 *
 * Copyright (C) 1997 Pretty Good Privacy, Inc.
 *
 * Written by Mark H. Weaver
 *
 * $Id: util.c,v 1.11 1997/11/07 00:44:10 mhw Exp $
 */

#include <stdlib.h>
#include "util.h"

char const hexDigits[] = "0123456789abcdef";
char const radix64Digits[] =
#if 0   /* Standard */
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
#else   /* Modified form that avoids hard-to-OCR characters */
    "ABCDEFGHIJKLMNPQRSTVWXYZabcdehijklmnpqtuwy145689\\^!#$%&*+=/:<>?@";
#endif

signed char hexDigitsInv[256];
signed char radix64DigitsInv[256];

/* teun: moved initialisation of all three CRCPoly's to initUtil() */

/* CRC-CCITT: x^16 + x^12 + x^5 + 1 */
CRCPoly     crcCCITTPoly;

/*
 * PRZ's magic 24-bit polynomial - (x+1) * (irreducible of degree 23)
 * x^24 +x^23 +x^18 +x^17 +x^14 +x^11 +x^10 +x^7 +x^6 +x^5 +x^4 +x^3 +x +1
 * (Developed by Neal Glover).  Note: this is bit-reversed from the form
 * used in PGP, 0x1864cfb.
 */
CRCPoly     crc24Poly;

/* CRC-32: x^32+x^26+x^23+x^22+x^16+x^12+x^11+x^10+x^8+x^7+x^5+x^4+x^2+x+1 */
CRCPoly     crc32Poly;


EncodeFormat const  hexFormat =
{
    NULL,               /* nextFormat */
    '-',                /* headerTypeChar */
    hexDigits,          /* digits */
    hexDigitsInv,       /* digitsInv */
    4,                  /* bitsPerDigit */
    16,                 /* radix */
    &crcCCITTPoly,      /* lineCRC */
    &crc32Poly,         /* pageCRC */
    8,                  /* runningCRCBits */
    24,                 /* runningCRCShift */
    0xFF                /* runningCRCMask */
};

EncodeFormat const  radix64Format =
{
    &hexFormat,         /* nextFormat */
    'A',                /* headerTypeChar */
    radix64Digits,      /* digits */
    radix64DigitsInv,   /* digitsInv */
    6,                  /* bitsPerDigit */
    64,                 /* radix */
    &crc24Poly,         /* lineCRC */
    &crc32Poly,         /* pageCRC */
    12,                 /* runningCRCBits */
    20,                 /* runningCRCShift */
    0xFFF               /* runningCRCMask */
};

EncodeFormat const *    firstFormat = &radix64Format;



static void InitCRCPoly(CRCPoly *poly)
{
    int     i, oneBit;
    CRC     crc = 1;

    poly->table[0] = 0;
    for (oneBit = 0x80; oneBit > 0; oneBit >>= 1) {
        crc = (crc >> 1) ^ ((crc & 1) ? poly->poly : 0);
        for (i = 0; i < 0x100; i += 2 * oneBit)
            poly->table[i + oneBit] = poly->table[i] ^ crc;
    }
}

CRC CalculateCRC(CRCPoly const *poly, CRC crc,
                 byte const *buffer, size_t length)
{
    while (length--)
        crc = (crc >> 8) ^ poly->table[(crc & 0xFF) ^ (*buffer++)];
    return crc;
}

CRC ReverseCRC(CRCPoly const *poly, CRC crc, byte b)
{
    int     i, highBit = poly->highBit;

    for (i = 0; i < 8; i++) {
        if (crc & highBit)      /* highBit is 2^(poly->bits-1) */
            crc = ((crc ^ poly->poly) << 1) ^ 1;
        else
            crc <<= 1;
    }
    return crc ^ b;
}

static void InitDigitsInv(char const *digits, signed char *digitsInv)
{
    int     i;

    for (i = 0; i < 256; i++)
        digitsInv[i] = -1;
    for (i = 0; digits[i]; i++)
        digitsInv[(byte)digits[i]] = i;
}

/* Returns the number of chars encoded */

int EncodeCheckDigits(EncodeFormat const *fmt, word32 num,
                      int numBits, char *dest)
{
    int     destLen = EncodedLength(fmt, numBits);
    word32  digitMask = fmt->radix - 1;
    int     i;

    for (i = destLen - 1; i >= 0; i--)
    {
        dest[i] = EncodeDigit(fmt, num & digitMask);
        num >>= fmt->bitsPerDigit;
    }
    return destLen;
}

/* Returns 1 if there's an error */
int DecodeCheckDigits(EncodeFormat const *fmt, char const *src, char **endPtr,
                      int numBits, word32 *valuePtr)
{
    word32  value = 0;
    int     digitValue;
    int     i = EncodedLength(fmt, numBits);

    while (i--)
    {
        digitValue = DecodeDigit(fmt, *src++);
        if (digitValue < 0)
        {
            /* Invalid digit found */
            *valuePtr = 0;
            if (endPtr)
                *endPtr = NULL;
            return 1;
        }
        value = (value << fmt->bitsPerDigit) | digitValue;
    }
    *valuePtr = value;
    if (endPtr)
        *endPtr = (char *)src;
    return 0;
}

EncodeFormat const *FindFormat(char headerTypeChar)
{
    EncodeFormat const *    fmt = firstFormat;

    while (fmt && fmt->headerTypeChar != headerTypeChar)
        fmt = fmt->nextFormat;
    return fmt;
}


void InitUtil()
{
    /* teun: removed "{ }" for MS VC compile */

    crcCCITTPoly.bits = 16;
    crcCCITTPoly.poly = 0x8408;
    crcCCITTPoly.highBit = 0x8000;

    crc24Poly.bits = 24;
    crc24Poly.poly = 0xdf3261;
    crc24Poly.highBit = 0x800000;

    crc32Poly.bits = 32;
    crc32Poly.poly = 0xedb88320;
    crc32Poly.highBit = 0x80000000;

    InitCRCPoly(&crcCCITTPoly);
    InitCRCPoly(&crc24Poly);
    InitCRCPoly(&crc32Poly);
    InitDigitsInv(hexDigits, hexDigitsInv);
    InitDigitsInv(radix64Digits, radix64DigitsInv);
}
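
The table construction in InitCRCPoly and the byte-wise update in CalculateCRC can be tried on their own. The sketch below is a stand-alone illustration, not the util.h interface: `DemoCRCPoly`, `demo_init` and `demo_crc` are names invented here, and the fixed-width types come from `<stdint.h>` rather than the listing's `CRC` and `byte` typedefs. It uses the same bit-reversed polynomials that InitUtil assigns (0x8408 and 0xedb88320).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for the CRCPoly structure declared in util.h */
typedef struct {
    uint32_t poly;          /* Bit-reversed generator polynomial */
    uint32_t table[256];    /* One entry per possible input byte */
} DemoCRCPoly;

/*
 * Same doubling scheme as InitCRCPoly: compute the entries for bytes
 * with a single bit set, and derive every other entry by XOR, which
 * works because the CRC is linear.
 */
static void
demo_init(DemoCRCPoly *p, uint32_t poly)
{
    uint32_t crc = 1;
    int i, oneBit;

    assert(p != NULL);
    p->poly = poly;
    p->table[0] = 0;
    for (oneBit = 0x80; oneBit > 0; oneBit >>= 1) {
        crc = (crc >> 1) ^ ((crc & 1) ? poly : 0);
        for (i = 0; i < 0x100; i += 2 * oneBit)
            p->table[i + oneBit] = p->table[i] ^ crc;
    }
}

/* Byte-at-a-time update, as in CalculateCRC */
static uint32_t
demo_crc(DemoCRCPoly const *p, uint32_t crc,
         unsigned char const *buf, size_t len)
{
    while (len--)
        crc = (crc >> 8) ^ p->table[(crc & 0xFF) ^ *buf++];
    return crc;
}
```

With an initial value of 0, as ChecksumLine uses for the per-line CRC, the 0x8408 table yields the CRC-16 variant commonly catalogued as KERMIT. Conventional CRC-32 additionally starts from 0xFFFFFFFF and inverts the result, which the listing does not do.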



/*
 * Local Variables:
 * tab-width: 4
 * End:
 * vi: ts=4 sw=4
 * vim: si
 */

/*

 * heap.c -- Simple priority queue.  Takes pointers to cost values
 * (presumably the first field in a larger structure) and returns
 * them in increasing order of cost.
 *
 * Copyright (C) 1997 Pretty Good Privacy, Inc.
 *
 * Written by Colin Plumb and Mark H. Weaver
 *
 * $Id: heap.c,v 1.2 1997/07/05 02:55:23 colin Exp $
 */

#include <stdio.h>  /* For fprintf(stderr, "Out of memory") */


#include <stdlib.h> /* For malloc() & co. */

#include "heap.h"

#define HeapParent(i)       ((i) / 2)
#define HeapLeftChild(i)    ((i) * 2)
#define HeapRightChild(i)   ((i) * 2 + 1)
#define HeapElem(h, i)      (h)->elems[i]
#define HeapMinElem(h)      HeapElem(h, 1)
#define HeapElemCost(e)     (*(e))
#define HeapCost(h, i)      HeapElemCost(HeapElem(h, i))
#define HeapSize(h)         ((h)->numElems)

static void
SiftDown(Heap const *heap, HeapCost *e)
{
    HeapIndex size = HeapSize(heap), parent = 1, child;
    HeapCost cparent = HeapElemCost(e), cchild;

    for (;;) {
        child = 2*parent;
        if (child > size)
            break;
        cchild = HeapCost(heap, child);
        if (child < size && cchild > HeapCost(heap, child+1)) {
            cchild = HeapCost(heap, child+1);
            child++;
        }
        if (cparent <= cchild)
            break;  /* Stop sifting down */
        HeapElem(heap, parent) = HeapElem(heap, child);
        parent = child;
    }
    HeapElem(heap, parent) = e;
}

/* Debug tool: verify heap property */
void
HeapVerify(Heap *heap)
{
    HeapIndex i;

    for (i = 2; i <= HeapSize(heap); i++)
        if (HeapCost(heap, i) < HeapCost(heap, HeapParent(i)))
            fprintf(stderr, "DEBUG: VerifyHeap failed at elem %d\n", i);
}

/* Remove and return the minimum cost from the heap. */
HeapCost *
HeapGetMin(Heap *heap)
{
    HeapIndex lastElem = HeapSize(heap);
    HeapCost *retval;

    if (!lastElem)
        return NULL;
    retval = HeapMinElem(heap);
    HeapSize(heap) = lastElem-1;
    SiftDown(heap, HeapElem(heap, lastElem));
    return retval;
}


/* Helper - set heap size, reallocating if needed */
static void
HeapResize(Heap *heap, HeapIndex newNumElems)
{
    if (newNumElems >= heap->elemsAllocated) {
        HeapIndex newAllocSize = heap->elemsAllocated * 2;

        if (newAllocSize <= newNumElems)
            newAllocSize = newNumElems + 1;
        heap->elems = (HeapCost **)realloc((void *)heap->elems,
                                sizeof(*heap->elems) * newAllocSize);
        if (heap->elems == NULL) {
            fprintf(stderr, "Fatal error: Out of memory growing heap\n");
            exit(1);
        }
        heap->elemsAllocated = newAllocSize;
    }
    heap->numElems = newNumElems;
}

/* Add an element to the heap */
void
HeapInsert(Heap *heap, HeapCost *newElem)
{
    HeapIndex parent, i = ++HeapSize(heap);
    HeapCost cost = HeapElemCost(newElem);

    HeapResize(heap, i);
    /* Sift up until parent = 0 */
    while ((parent = HeapParent(i)) && HeapCost(heap, parent) > cost) {
        HeapElem(heap, i) = HeapElem(heap, parent);
        i = parent;
    }
    heap->elems[i] = newElem;
}


/* Initialize a new heap */
void
HeapInit(Heap *heap, HeapIndex initSize)
{
    initSize++; /* Add one for temporary element */
    if (initSize < 1)
        initSize = 1;
    heap->elems = (HeapCost **)malloc(initSize * sizeof(*heap->elems));
    if (heap->elems == NULL) {
        fprintf(stderr, "Fatal error: Out of memory creating heap\n");
        exit(1);
    }
    heap->elemsAllocated = initSize;
    heap->numElems = 0;
}

/* Free up a heap's resources. */
void
HeapDestroy(Heap *heap)
{
    free((void *)heap->elems);
    heap->elemsAllocated = 0;
    heap->numElems = 0;
    heap->elems = NULL;
}
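
The 1-based array layout behind HeapInsert and HeapGetMin (slot 0 unused, parent of slot i at i/2) can be exercised with a small fixed-capacity heap. This is a sketch under stated assumptions: `DemoHeap`, `demo_insert` and `demo_get_min` are illustrative names, the costs are plain `int` values, and there is no reallocation as in HeapResize.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX 16

/*
 * Illustrative fixed-capacity heap of pointers to costs.  Slot 0 is
 * unused so that the parent of slot i is i/2 and its children are
 * 2i and 2i+1.
 */
typedef struct {
    int const *elems[DEMO_MAX + 1];
    int numElems;
} DemoHeap;

/* Sift up: move costlier parents down until the new element fits */
static void
demo_insert(DemoHeap *h, int const *e)
{
    int i, parent;

    assert(h->numElems < DEMO_MAX);
    i = ++h->numElems;
    while ((parent = i / 2) != 0 && *h->elems[parent] > *e) {
        h->elems[i] = h->elems[parent];
        i = parent;
    }
    h->elems[i] = e;
}

/* Remove the minimum, then sift the former last element down from the root */
static int const *
demo_get_min(DemoHeap *h)
{
    int const *min, *last;
    int parent = 1, child, size;

    if (h->numElems == 0)
        return NULL;
    min = h->elems[1];
    last = h->elems[h->numElems];
    size = --h->numElems;
    for (;;) {
        child = 2 * parent;
        if (child > size)
            break;
        if (child < size && *h->elems[child] > *h->elems[child + 1])
            child++;        /* Pick the cheaper of the two children */
        if (*last <= *h->elems[child])
            break;          /* Stop sifting down */
        h->elems[parent] = h->elems[child];
        parent = child;
    }
    h->elems[parent] = last;
    return min;
}
```

Because the heap stores pointers to the cost field, a caller can recover the enclosing structure from the returned pointer when the cost is its first member, which is the arrangement the heap.c header comment presumes.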


/*
 * Local Variables:
 * tab-width: 4
 * End:
 * vi: ts=4 sw=4
 * vim: si
 */

/*

 * mempool.c - Pooled memory allocation, similar to GNU obstacks.
 *
 * $Id: mempool.c,v 1.5 1997/11/13 23:53:08 colin Exp $
 */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h> /* For malloc() & free() */

#include "mempool.h"

/*
 * The memory pool allocation functions
 *
 * These are based on a linked list of memory blocks, usually of uniform
 * size.  New memory is allocated from the tail of the current block,
 * until that is inadequate, then a new block is allocated.
 * The entire pool can be freed at once by calling memPoolFree().
 */
struct PoolBuf {
    struct PoolBuf *next;
    unsigned size;
    /* Data follows */
};

/* The prototype empty pool, including the default allocation size. */
static struct MemPool EmptyPool = { 0, 0, 0, 4096, 0, 0, 0 };


/* Initialize the pool for first use */
void
memPoolInit(struct MemPool *pool)
{
    *pool = EmptyPool;
}

/* Set the pool's purge function */
void
memPoolSetPurge(struct MemPool *pool, int (*purge)(void *), void *arg)
{
    pool->purge = purge;
    pool->purgearg = arg;
}

/* Free all the memory in the pool */
void
memPoolEmpty(struct MemPool *pool)
{
    struct PoolBuf *buf;

    while ((buf = pool->head) != 0) {
        pool->head = buf->next;
        free(buf);
    }
    pool->freespace = 0;
    pool->totalsize = 0;
}


/*
 * Restore a pool to a marked position, freeing subsequently allocated
 * memory.
 */
void
memPoolCutBack(struct MemPool *pool, struct MemPool const *cutback)
{
    struct PoolBuf *buf;

    assert(pool);
    assert(cutback);
    assert(pool->totalsize >= cutback->totalsize);

    while ((buf = pool->head) != cutback->head) {
        pool->head = buf->next;
        free(buf);
    }
    *pool = *cutback;
}


/*
 * Allocate a chunk of memory for a structure.  Alignment is assumed to be
 * a power of 2.  It could be generalized, if that ever becomes relevant.
 * Note that alignment is from the beginning of an allocated chunk, which
 * is guaranteed by ANSI to be as aligned as can possibly matter.
 */
void *
memPoolAlloc(struct MemPool *pool, unsigned len, unsigned alignment)
{
    char *p;
    unsigned t;

    /* Where to allocate next object */
    p = pool->freeptr;
    /* How far it is from the beginning of the chunk. */
    t = p - (char *)pool->head;
    /* How much to round up freeptr to make alignment */
    t = -t & --alignment;

    /* Okay, does it fit? */
    if (pool->freespace >= len+t) {
        pool->freespace -= len+t;
        p += t;
        pool->freeptr = p + len;
        return p;
    }

    /* It does not fit in the current chunk.  Go for a bigger chunk. */

    /* First, figure out how much to skip at the beginning of the chunk */
    alignment &= -(unsigned)sizeof(struct PoolBuf);
    alignment += sizeof(struct PoolBuf);
    /* Then, figure out a chunk size that will fit */
    t = pool->chunksize;
    assert(t);
    while (len + alignment > t)
        t *= 2;
    while ((p = malloc(t)) == 0) {
        /* If that didn't work, try purging or smaller allocations */
        if (!pool->purge || !pool->purge(pool->purgearg)) {
            t /= 2;
            if (len + alignment > t) {
                fputs("Out of memory!\n", stderr);
                exit(1);    /* Failed */
            }
        }
    }

    /* Update the various pointers. */
    pool->totalsize += t;
    ((struct PoolBuf *)p)->next = pool->head;
    ((struct PoolBuf *)p)->size = t;
    pool->head = (struct PoolBuf *)p;
    pool->freespace = t - len - alignment;
    p += alignment;
    pool->freeptr = p + len;

    return p;
}
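
The round-up step in memPoolAlloc (`t = -t & --alignment`) relies on unsigned wrap-around: for a power-of-2 alignment, negating the offset and masking the low bits gives the distance to the next aligned offset. A minimal sketch of just that arithmetic follows; `demo_align_pad` is a hypothetical name that does not appear in mempool.c.

```c
#include <assert.h>

/*
 * Padding needed to advance `offset` to the next multiple of `alignment`.
 * Like memPoolAlloc, this assumes alignment is a power of 2, so that
 * (alignment - 1) is a mask of the low bits.
 */
static unsigned
demo_align_pad(unsigned offset, unsigned alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return -offset & (alignment - 1);
}
```

For example, an offset of 13 with 8-byte alignment needs 3 bytes of padding to reach 16, while an offset that is already a multiple of the alignment needs none.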
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bc38e5 /х 

 * munge.c -- Program to convert a text file into "munged" form,
 *            suitable for reconstruction from printed form.  Tabs are
 *            made visible and checksums are added to each line and each
 *            page to protect against transcription errors.
 *
 * Copyright (C) 1997 Pretty Good Privacy, Inc.
 *
 * Designed by Colin Plumb, Mark H. Weaver, and Philip R. Zimmermann
 * Written by Mark H. Weaver
 *
 * $Id: munge.c,v 1.32 1997/11/12 23:28:53 mhw Exp $
 */

#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <ctype.h>
#include <stdlib.h>

#include "util.h"


/*
 * The file is divided into pages, and the format of each page is
 *
--1414 000b2dc79af40010002 Page 1 of munge.c

bc38e5 /*
403838  * munge.c -- Program to convert a text file into munged form
647222  *
193f28  * Copyright (C) 1997 Pretty Good Privacy, Inc.
827222  *
699025  * Designed by Colin Plumb, Mark H. Weaver, and Philip R. Zimmermann
0d050c  * Written by Mark H. Weaver
 *
 * Where the first 2 columns are the high 8 bits (in hex) of a running
 * CRC-32 of the page (the string "--", unlikely to be confused with
 * any digits, indicates a page header line) and the next 4 columns
 * are a CRC-16 of the rest of the line.  Then a space (not counted in
 * the CRC), and the line of text.  Tabs are printed as the currency
 * symbol (ISO Latin 1 character 164) followed by the appropriate number
 * of spaces, and any form feeds are printed as a yen symbol (Latin 1 165).
 * The CRC is computed on the transformed line, including the trailing
 * newline.  No trailing whitespace is permitted.
 *
 * The header line contains a (hex) number of the form 0ffcccccccctpppnnnn,
 * where the digit 0 is a version number, ff are flags, cccccccc is the CRC-32
 * of the page, t is the tab size (usually 4 or 8; 0 for binary files that
 * are sent in radix-64), ppp is the product number (usually 1, different
 * for different books), and nnnn is the file number (sequential from 1).
 *
 * This is followed by " Page %u of " and the file name.
 */
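
The header number described in the comment above can be split into its fields with a few fixed-width hex reads. The sketch below is an illustration only: `DemoHeader`, `demo_field` and `demo_parse_header` are invented names, and the field widths (1, 2, 8, 1, 3 and 4 hex digits) are taken from the comment's `0ffcccccccctpppnnnn` pattern rather than from the `HDR_*_BITS` constants, whose values are not shown in this listing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative container for the fields of a page header number */
typedef struct {
    unsigned version, flags, tabWidth, product, fileNumber;
    unsigned long pageCRC;
} DemoHeader;

/* Parse `width` hex digits starting at *pos and advance *pos past them */
static unsigned long
demo_field(char const **pos, size_t width)
{
    char tmp[9];

    assert(width < sizeof(tmp));
    memcpy(tmp, *pos, width);
    tmp[width] = '\0';
    *pos += width;
    return strtoul(tmp, NULL, 16);
}

/*
 * Split the 19-digit header number 0ffcccccccctpppnnnn into its fields.
 * Returns 0 on success, -1 if the string has the wrong shape.
 */
static int
demo_parse_header(char const *digits, DemoHeader *h)
{
    if (strlen(digits) != 19 || strspn(digits, "0123456789abcdef") != 19)
        return -1;
    h->version    = (unsigned)demo_field(&digits, 1);
    h->flags      = (unsigned)demo_field(&digits, 2);
    h->pageCRC    = demo_field(&digits, 8);
    h->tabWidth   = (unsigned)demo_field(&digits, 1);
    h->product    = (unsigned)demo_field(&digits, 3);
    h->fileNumber = (unsigned)demo_field(&digits, 4);
    return 0;
}
```

Applied to the sample header in the comment, `000b2dc79af40010002`, this gives version 0, flags 0, page CRC b2dc79af, tab width 4, product 1 and file number 2.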

c7af5a 

typedef struct MungeState
{
    EncodeFormat const *    fmt;
    EncodeFormat const *    hFmt;
    int             binaryMode, tabWidth;
    long            origLineNumber;
    long            productNumber, fileNumber, pageNumber, lineNumber;
    unsigned long   fileOffset;
    CRC             pageCRC;
    char const *    fileName;
    char const *    fileNameTail;
    char *          pageBuffer; /* Buffer large enough to hold one page */
    char *          pagePos;    /* Current position in pageBuffer */
    word16          hdrFlags;
    FILE *          file;
    FILE *          out;
} MungeState;



void ChecksumLine(EncodeFormat const *fmt, char const *line, size_t length,
                  char *prefix, CRC *pageCRC)
{
    CRC     lineCRC;
    CRC     runCRCPart = 0;

    lineCRC = CalculateCRC(fmt->lineCRC, 0, (byte const *)line, length);
    if (pageCRC != NULL)
    {
        *pageCRC = CalculateCRC(fmt->pageCRC, *pageCRC,
                                (byte const *)line, length);
        runCRCPart = RunningCRCFromPageCRC(fmt, *pageCRC);
    }

    prefix += EncodeCheckDigits(fmt, runCRCPart, fmt->runningCRCBits, prefix);
    prefix += EncodeCheckDigits(fmt, lineCRC, fmt->lineCRC->bits, prefix);

    *prefix++ = ' ';    /* Write a space over the null byte */
}

/* Returns 1 for convenience */
int PrintFileError(MungeState *state, char const *message)
{
    fprintf(stderr, "%s in %s %s %lu\n", message, state->fileName,
            state->binaryMode ? "offset" : "line",
            state->binaryMode ? state->fileOffset : state->origLineNumber);
    return 1;
}


int MungeLine(MungeState *state, char *buffer, int length,
              char *line, int *bufferUsed)
{
    int     i = 0, j = 0, jOld = 0;
    char    ch;

    for (i = 0; i < length && j < LINE_LENGTH; i++)
    {
        jOld = j;
        ch = buffer[i];
        if (ch == '\t')
        {
            line[j++] = TAB_CHAR;
            /*if (state->tabWidth < 1)
                return PrintFileError(state,
                                      "ERROR: Tab found in radix64 stream");
            else
                while (j % state->tabWidth && j < LINE_LENGTH)
                    line[j++] = TAB_PAD_CHAR; */
        }
        else if (ch == '\n')
        {
            if (i + 1 < length)
                return PrintFileError(state,
                        "UNEXPECTED ERROR: fgets read past newline!?");
            break;
        }
        else if (ch == '\f')
        {
            break;
        }
        else if (ch == ' ' && (j <= 0 || line[j-1] == ' ' ||
                               line[j-1] == SPACE_CHAR ||
                               i+1 >= length || buffer[i+1] == '\n'))
        {
            line[j++] = SPACE_CHAR;
        }
        else if (ch >= ' ' && ch <= '~')
            line[j++] = ch;
        else
            return PrintFileError(state, "ERROR: Non-ASCII char");
    }

    if (i < length && buffer[i] == '\n')
    {
        i++;
        state->origLineNumber++;
    }
    else if (i < length && buffer[i] == '\f' && j < LINE_LENGTH)
    {
        i++;
        line[j++] = FORMFEED_CHAR;
    }
    else
    {
        /* If there's no newline, we need to add the continuation marker */
        if (i > 0 && j >= LINE_LENGTH)
        {
            /* Remove the last character if we're out of room */
            i--;
            j = jOld;
        }
        line[j++] = CONTIN_CHAR;
    }

    /* Strip trailing spaces */
    while (j > 0 && isspace((unsigned char)line[j - 1]))
        j--;

    if (j > LINE_LENGTH)    /* This should never happen */
        return PrintFileError(state, "ERROR: Internal error, line too long");

    /* Add trailing newline and NULL */
    line[j++] = '\n';
    line[j++] = '\0';

    /* Return number of chars used from buffer */
    *bufferUsed = i;

    return 0;
}
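
The visible-tab convention from the page-format comment (a marker character followed by padding up to the next tab stop) corresponds to the block that is commented out in MungeLine above. A stand-alone sketch of that expansion is given below. `demo_show_tabs` is a hypothetical function, and the marker and pad characters are parameters because the values of `TAB_CHAR` and `TAB_PAD_CHAR` are defined outside this listing.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy src to dest, replacing each tab with tabMark followed by padMark
 * up to the next multiple of tabWidth.  Returns the output length, or
 * (size_t)-1 if dest (of destLen bytes) is too small.
 */
static size_t
demo_show_tabs(char const *src, char *dest, size_t destLen,
               unsigned tabWidth, char tabMark, char padMark)
{
    size_t j = 0;

    assert(tabWidth > 0 && destLen > 0);
    for (; *src != '\0'; src++) {
        if (j + 1 >= destLen)
            return (size_t)-1;
        if (*src != '\t') {
            dest[j++] = *src;
            continue;
        }
        dest[j++] = tabMark;
        while (j % tabWidth != 0) {
            if (j + 1 >= destLen)
                return (size_t)-1;
            dest[j++] = padMark;
        }
    }
    dest[j] = '\0';
    return j;
}
```

Keeping the marker as the first character of the expansion is what lets a reader of the printed page, or an OCR repair tool, tell a real tab from a run of ordinary spaces.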

elaf5a 

static void
Encode3(byte const src[3], char dest[4])
{
    dest[0] = radix64Digits[    (src[0]>>2 & 0x3f)];
    dest[1] = radix64Digits[(src[0]<<4 & 0x30) | (src[1]>>4 & 0x0f)];
    dest[2] = radix64Digits[(src[1]<<2 & 0x3c) | (src[2]>>6 & 0x03)];
    dest[3] = radix64Digits[    (src[2]    & 0x3f)];
}

static int
EncodeLine(byte const *src, int srcLen, char *dest)
{
    char *  destp = dest;
    byte    tempSrc[3];

    for (; srcLen >= 3; srcLen -= 3)
    {
        Encode3(src, destp);
        src += 3; destp += 4;
    }

    if (srcLen > 0)
    {
        memset(tempSrc, 0, sizeof(tempSrc));
        memcpy(tempSrc, src, srcLen);
        Encode3(tempSrc, destp);    /* Encode the zero-padded copy */
        src += 3; destp += 4; srcLen -= 3;
        while (srcLen < 0)
            destp[srcLen++] = RADIX64_END_CHAR;
    }

    return destp - dest;
}
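
The 3-bytes-to-4-digits packing done by Encode3 and EncodeLine can be checked against well-known values if the standard alphabet is used. The sketch below makes two assumptions that differ from the listing: it uses the standard alphabet from the `#if 0` branch of util.c instead of the OCR-friendly one, and it pads with `=` as in conventional base64, since the value of `RADIX64_END_CHAR` is not shown here. `demoDigits` and `demo_encode` are illustrative names.

```c
#include <assert.h>
#include <string.h>

/* Standard alphabet (the "#if 0" branch in util.c), not the book's */
static char const demoDigits[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/*
 * Encode srcLen bytes as radix-64, padding the final group with '='.
 * dest must have room for 4*((srcLen+2)/3) + 1 chars.  Returns the
 * number of digits written (excluding the terminating NUL).
 */
static size_t
demo_encode(unsigned char const *src, size_t srcLen, char *dest)
{
    char *p = dest;
    unsigned char g[3];
    size_t n;

    assert(src != NULL && dest != NULL);
    while (srcLen > 0) {
        n = srcLen < 3 ? srcLen : 3;
        memset(g, 0, sizeof(g));    /* Zero-pad a short final group */
        memcpy(g, src, n);
        p[0] = demoDigits[g[0] >> 2];
        p[1] = demoDigits[(g[0] << 4 & 0x30) | (g[1] >> 4)];
        p[2] = n > 1 ? demoDigits[(g[1] << 2 & 0x3c) | (g[2] >> 6)] : '=';
        p[3] = n > 2 ? demoDigits[g[2] & 0x3f] : '=';
        src += n; srcLen -= n; p += 4;
    }
    *p = '\0';
    return (size_t)(p - dest);
}
```

As in EncodeLine, a short final group is copied into a zeroed 3-byte buffer before encoding, so the unused low bits of the last real digit are always zero.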

7faf5a 

static int
MungeBinaryLine(MungeState *state, byte const *buffer, int length, char *line)
{
    char    binLine[128];
    int     binLength;          /* Destination length */
    int     used;

    binLength = EncodeLine(buffer, length, binLine);

    /* Append newline */
    binLine[binLength++] = '\n';
    binLine[binLength] = '\0';

    return MungeLine(state, binLine, binLength, line, &used);
}


int MaybePageBreak(MungeState *state)
{
    EncodeFormat const *    fmt = state->fmt;
    EncodeFormat const *    hFmt = state->hFmt;

    if (state->lineNumber >= LINES_PER_PAGE)
    {
        char    line[512];
        char *  lineData = line + PREFIX_LENGTH;
        char *  p        = lineData;

        p += EncodeCheckDigits(hFmt, 0, HDR_VERSION_BITS, p);
        p += EncodeCheckDigits(hFmt, state->hdrFlags, HDR_FLAG_BITS, p);
        p += EncodeCheckDigits(hFmt, state->pageCRC, fmt->pageCRC->bits, p);
        p += EncodeCheckDigits(hFmt, state->tabWidth, HDR_TABWIDTH_BITS, p);
        p += EncodeCheckDigits(hFmt, state->productNumber, HDR_PRODNUM_BITS, p);
        p += EncodeCheckDigits(hFmt, state->fileNumber, HDR_FILENUM_BITS, p);

        sprintf(p, " Page %ld of %s\n", state->pageNumber + 1,
                state->fileNameTail);

        if (strlen(lineData) > LINE_LENGTH + 1)
        {
            PrintFileError(state, "ERROR: Header line too long");
            fprintf(stderr, "> %s", lineData);
            return -1;
        }

        /* Compute checksums and prefix them to line */
        ChecksumLine(fmt, lineData, strlen(lineData), line, NULL);

        fprintf(state->out, "%c%c%s\n%s\f", HDR_PREFIX_CHAR,
                fmt->headerTypeChar, line + 2, state->pageBuffer);

        state->pageNumber++;
        state->lineNumber = 0;
        state->pageCRC = 0;
        state->pagePos = state->pageBuffer;     /* Clear page buffer */
    }
    return 0;
}


/*
 * Search for Emacs "tab-width: " marker in file.
 * Emacs is stricter about the format, but this will do.
 */
int FindTabWidth(MungeState *state)
{
    char const * const  tabWidthMarker = " tab-width: ";
    char            buffer[512];
    char *          p;
    int             length;
    int             tabWidth = 0;

    fseek(state->file, -(long)(sizeof(buffer) - 1), SEEK_END);
    length = fread(buffer, 1, sizeof(buffer) - 1, state->file);
    buffer[length] = '\0';
    p = strstr(buffer, tabWidthMarker);
    if (p != NULL)
    {
        p += strlen(tabWidthMarker);
        while (*p != '\0' && *p != '\n' && isspace(*p))
            p++;
        tabWidth = strtol(p, &p, 10);
        while (*p != '\0' && *p != '\n' && isspace(*p))
            p++;
        if (*p != '\n' || tabWidth < 2)
            tabWidth = 0;
        else if (tabWidth > 16)
            fprintf(stderr, "WARNING: Weird tab-width (%d), %s\n",
                    tabWidth, state->fileName);
    }
    return tabWidth;
}
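
The marker search in FindTabWidth can be tried on an in-memory string instead of the tail of a file. The sketch below is a simplified illustration: `demo_find_tab_width` is a hypothetical name, and where FindTabWidth prints a warning for widths above 16 but still returns them, this version simply rejects such values.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Scan text for " tab-width: N" followed by end of line and return N,
 * or 0 if the marker is absent or N is outside the range 2..16.
 */
static int
demo_find_tab_width(char const *text)
{
    static char const marker[] = " tab-width: ";
    char const *p;
    char *end;
    long width;

    assert(text != NULL);
    p = strstr(text, marker);
    if (p == NULL)
        return 0;
    p += strlen(marker);
    width = strtol(p, &end, 10);
    while (*end != '\n' && isspace((unsigned char)*end))
        end++;          /* Allow trailing blanks before the newline */
    if (*end != '\n' || width < 2 || width > 16)
        return 0;
    return (int)width;
}
```

Run on the "Local Variables" trailer that closes each file in this listing, the function returns 4, which is the value MungeFile would then place in the page header's tab-size digit.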

40af5a 

/*
 * Open the given source file and send the munged output to the
 * FILE *, with the given options.
 */
int MungeFile(char const *fileName, FILE *out, EncodeFormat const *fmt,
              int binaryMode, int defaultTabWidth,
              long productNumber, long fileNumber)
{
    MungeState *    state;
    int             length, used;
    char            line[PREFIX_LENGTH + LINE_LENGTH + 10];
    char *          lineData = line + PREFIX_LENGTH;
    char            buffer[128];
    int             result = 0;

    state = (MungeState *)calloc(1, sizeof(*state));
    state->fmt = fmt;
    state->hFmt = &hexFormat;
    state->origLineNumber = 1;
    state->fileName = fileName;
    state->pageCRC = 0;
    state->productNumber = productNumber;
    state->fileNumber = fileNumber;
    state->pageNumber = 0;
    state->lineNumber = 0;
    state->fileOffset = 0;
    state->binaryMode = binaryMode;
    state->pageBuffer = malloc(PAGE_BUFFER_SIZE);
    state->pageBuffer[0] = '\0';
    state->pagePos = state->pageBuffer;
    state->hdrFlags = 0;
    state->out = out;

    state->fileNameTail = strrchr(state->fileName, '/');
    if (state->fileNameTail == NULL)
        state->fileNameTail = state->fileName;
    else
        state->fileNameTail++;

    state->file = fopen(state->fileName, binaryMode ? "rb" : "r");
    if (state->file == NULL)
    {
        result = errno;
        fprintf(stderr, "ERROR opening %s: %s\n",
                state->fileName, strerror(result));
        goto error;
    }

    if (state->binaryMode)
    {

67364b AAstate->tabWidth 
858350 A) 

2cc252 Aelse 

67d780 А( 

4d68fe AAstate->tabWidth = FindTabWidth(state); 

15da01 AAif (state->tabWidth == 0) 

a09e43 AAAstate->tabWidth = defaultTabWidth; 

ddf6ea AArewind(state->file); 

b58350 A) 

03af5a 

    while (!feof(state->file))
    {
        if (state->binaryMode)
        {
            length = fread(buffer, 1, BYTES_PER_LINE, state->file);
            if (length < 1)
            {
                if (feof(state->file))
                    break;
                goto fileError;
            }
            if ((result = MaybePageBreak(state)))
                goto error;
            if ((result = MungeBinaryLine(state, buffer, length, lineData)))
                goto error;
            state->fileOffset += length;
        }
        else
        {
            if (fgets(buffer, sizeof(buffer), state->file) == NULL)
            {
                if (feof(state->file))
                    break;
                goto fileError;
            }
            length = strlen(buffer);
            if ((result = MaybePageBreak(state)))
                goto error;
            if ((result = MungeLine(state, buffer, length, lineData, &used)))
                goto error;

            /* Push back whatever MungeLine did not consume */
            if (used < length)
                if (fseek(state->file, used - length, SEEK_CUR))
                    goto fileError;
        }

        /* Compute checksums and prefix them to the line */
        ChecksumLine(fmt, lineData, strlen(lineData), line, &state->pageCRC);

        strcpy(state->pagePos, line);
        length = strlen(state->pagePos);
        /* Suppress trailing whitespace on blank lines */
        if (length == PREFIX_LENGTH+1 && state->pagePos[length-1] == '\n') {
            state->pagePos[--length-1] = '\n';
            state->pagePos[length] = '\0';
        }
        state->pagePos += length;

        state->lineNumber++;
    }


    if (state->lineNumber > 0)
    {
        /* Force a final page break */
        state->lineNumber = LINES_PER_PAGE;
        state->hdrFlags |= HDR_FLAG_LASTPAGE;
        if ((result = MaybePageBreak(state)))
            goto error;
    }

    result = 0;
    goto done;

fileError:
    result = ferror(state->file);

error:
done:
    if (state != NULL)
    {
        if (state->file != NULL)
            fclose(state->file);
        free(state);
    }
    return result;
}


int main(int argc, char *argv[])
{
    int     result = 0;
    int     i, j;
    int     defaultTabWidth = 4;
    int     binaryMode = 0;
    long    productNumber = 1;
    long    fileNumber = 1;
    char *  endOfNumber;
    EncodeFormat const *fmt = NULL;

    InitUtil();

    for (i = 1; i < argc && argv[i][0] == '-'; i++)
    {
        if (0 == strcmp(argv[i], "--"))
        {
            i++;
            break;
        }
        for (j = 1; argv[i][j] != '\0'; j++)
        {
            if (isdigit(argv[i][j]))
            {
                defaultTabWidth = argv[i][j] - '0';
                if (defaultTabWidth < 2 || defaultTabWidth > 9)
                    fprintf(stderr, "WARNING: Weird default tab-width (%d)\n",
                            defaultTabWidth);
            }
            else if (argv[i][j] == 'b')
            {
                binaryMode = 1;
            }
            else if (argv[i][j] == 'F')
            {
                fmt = FindFormat(argv[i][j+1]);
                if (!fmt || argv[i][j+2] != '\0')
                {
                    fprintf(stderr, "ERROR: Invalid format char\n");
                    exit(1);
                }
                break;
            }
            else if (argv[i][j] == 'p')
            {
                productNumber = strtol(&argv[i][j+1], &endOfNumber, 10);
                if (*endOfNumber != '\0')
                {
                    fprintf(stderr, "ERROR: Invalid product number\n");
                    exit(1);
                }
                break;
            }
            else if (argv[i][j] == 'f')
            {
                fileNumber = strtol(&argv[i][j+1], &endOfNumber, 10);
                if (*endOfNumber != '\0')
                {
                    fprintf(stderr, "ERROR: Invalid file number\n");
                    exit(1);
                }
                break;
            }
            else
            {
                fprintf(stderr, "ERROR: Unrecognized option -%c\n", argv[i][j]);
                exit(1);
            }
        }
    }

    if (!fmt)
        fmt = binaryMode ? &radix64Format : &hexFormat;

    for (; i < argc; i++)
    {
        if ((result = MungeFile(argv[i], stdout, fmt, binaryMode,
                                defaultTabWidth, productNumber,
                                fileNumber)) != 0)
        {
            /* If result > 0, message should have already been printed */
            if (result < 0)
                fprintf(stderr, "ERROR: %s\n", strerror(result));
            exit(1);
        }
        fileNumber++;
    }

    return 0;
}


/*
 * Local Variables:
 * tab-width: 4
 * End:
 * vi: ts=4 sw=4
 * vim: si
 */


/*
 * subst.c -- Repair substitution tables
 *
 * Copyright (C) 1997 Pretty Good Privacy, Inc.
 *
 * Written by Colin Plumb
 *
 * $Id: subst.c,v 1.14 1997/11/03 22:12:00 colin Exp $
 *
 * IT IS EXPECTED that users of this program will play with these tables
 * and the cost values in the subst.h header.  (Some day, they'll all
 * get moved to an external config file.)
 *
 * NOTE: Other costs are hiding in the Filter functions in repair.c.
 * Remember to keep them all on the same scale.
 */

/*
 * The repair program copies its input to its output, making various
 * substitutions, until it manages to produce a version that satisfies
 * the parser.  This includes having a correct CRC for each line.
 * Each substitution has a cost, and the combinations are tried in order
 * of increasing cost.  NOTE that even translating "A"->"A" counts as
 * a substitution, although it may have zero cost.
 *
 * The intention is to correct transcription errors, where the
 * errors have a distinctly non-uniform distribution.  Slight
 * differences in cost produce a preference in trying some errors
 * first.  If an error costs half as much as another, combinations
 * of two of that error will be compared to one of the more expensive.
 * Too many cheap substitutions will result in repair spending
 * a very long time searching before considering the more expensive
 * substitutions.
 *
 * The following parameters and the raw substitution tables are expected
 * to be edited by the user based on experience.  Eventually, this
 * will be moved into an external config file, but for now it's a matter
 * of recompiling.
 */

#include "subst.h"
#include "util.h"

/* What the OCR software reports for "unrecognizable" */
#define UNRECOG_STRING "~\274"

/*
 * The input substitutions to make (one-to-one).  These are listed in
 * the order of correction, i.e. uncorrected input first, then corrected
 * output.  Substitutions are one-way; to get two-way, list it twice.
 */
struct RawSubst const substSingles[] = {
    /* Identity substitutions - note that period (.) is excluded */
    { "!\"#$%&'()*+,-/0123456789:;<=>?" SPACE_STRING,
      "!\"#$%&'()*+,-/0123456789:;<=>?" SPACE_STRING, 0, 0, NULL },
    { "@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_\t" TAB_STRING,
      "@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_\t" TAB_STRING, 0, 0, NULL },
    { "`abcdefghijklmnopqrstuvwxyz{|}~\f" FORMFEED_STRING,
      "`abcdefghijklmnopqrstuvwxyz{|}~\f" FORMFEED_STRING, 0, 0, NULL },
#if (TAB_PAD_CHAR & 128)    /* Not already included? */
    { TAB_PAD_STRING, TAB_PAD_STRING, 0, 0, NULL },
#endif
    { "\r\n" CONTIN_STRING, "\n\n" CONTIN_STRING, 0, 0, NULL },



    /* Occasionally these just get inserted as glitches */
    { ".,'*", NULL, 5, 10, FilterNearBlanks },
    /* This is now pretty infrequent */
    { "-_", "_-", 0, 10, FilterAfterRepeat },

    /*
     * Capitalization errors are common in some cases
     * c/C, s/S, u/U are fucked up all the time.
     * Also o/O, v/V and w/W.  x, y and z also give some problems.
     */
    { "cilmopsuvwxyz", "CILMOPSUVWXYZ", 7, 13, FilterNearLower },
    { "CILMOPSUVWXYZ", "cilmopsuvwxyz", 7, 13, FilterNearUpper },
    /* Other errors */

Д{ "g9aaiji;xX00Si", "9вр211;1%%003Ғ”, 10, 0, NULL }, 

A/* This seems to happen a lot */ 

АС "c", "г", 9, 0, NULL У, 


АС "ў", "5", 9, 0, NULL }, 
А ni us зит. 10, 0, NULL m 
    /* Uncommon errors */

    /* Weird stuff that's happened in the checksum part */
    /* A highish weight is okay here */
    { "SSEdJl", "554437", 15, 0, NULL },
    { "LESsPZ", "bb8a22", 15, 0, NULL },

    /* Weird stuff that has happened */
    { "BasAeaeRoooo", "3334aQQQpqbd", 5, 15, FilterIsBinary },
Д{ "oooo", "рара”, 0, 15, FilterIsBinary }, 

АС "ttTCCflo”, "iff([lfG", 12, 0, NULL }, 


#if 0
    /* If the line-breaks get screwed up, use these */
    { " ", "\n", 10, COST_INFINITY, FilterChecksumFollows },
    { "\n", " ", COST_INFINITY, 10, FilterChecksumFollows },
    { "\n", NULL, COST_INFINITY, 11, FilterChecksumFollows },
#endif

    { NULL, NULL, 0, 0, NULL }
};


/* The many-to-many substitutions */
struct RawSubst const substMultiples[] = { 


Ас opns caes 25270; NUECES Fy 

АС "``", үнө 2, 0, NULL У, 

А{ ",'", PV”, 2, ©, NULL }, 

At "',", "\"", 2, 0, NULL }, 

At ",,", "у", 2, 0, NULL }, 

    /* Extra inserted spaces are common */
    { " ", " ", COST_INFINITY, 0, FilterFollowsSpace },
    { " ", "", 0, 15, FilterFollowsSpace },
    { "\t", " ", COST_INFINITY, 0, FilterFollowsSpace },
    { "\t", "", 0, 10, FilterFollowsSpace },
    /* Convert between SPACE_CHAR dots and periods */
    { ".", SPACE_STRING, 1, COST_INFINITY, FilterFollowsSpace },
    { ".", " " SPACE_STRING, COST_INFINITY, 10, FilterFollowsSpace },
    { SPACE_STRING, ".", 15, 5, FilterFollowsSpace },
    { SPACE_STRING, " " SPACE_STRING, COST_INFINITY, 5, FilterFollowsSpace },
    /* Replace "unknown" by zero - it often is */
    { UNRECOG_STRING, "0", 1, 0, NULL },
    { UNRECOG_STRING, "_", 2, 0, NULL },
    { UNRECOG_STRING, ")", 3, 0, NULL },
    { UNRECOG_STRING, "^", 4, 0, NULL },
    /* Except that these glitches are common */


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АС UNRECOG, STRING" ' ", 
АС UNRECOG, STRING" ' ", 
A{ "'"UNRECOG. STRING, "V"", 0, 0, NULL }, 
А UNRECOG, STRING UNRECOG, STRING , "V"", 0, 0, NULL }, 
A/* Something else that has been seen ж/ 


"уму, 0, 0, NULL У, 
"Ye" 1, 9, NULL у, 


АС "V'", "УМУ", 5, 0, NULL }, 
    /* A common transposition */
    { "\"'", "'\"", 5, 0, NULL },
    { "'\"", "\"'", 5, 0, NULL },
    /* These also happen fairly often */
АС "үл", "17", 5, 0, NULL 7, 
АС "NI ", 015 5, 0, NULL 7, 
A/* Common glitches */ 

АС "Nt. An", "Nn", 5, 0, NULL У, 
АС "Nt, Nn", "Nn", 5, 0, NULL У, 
АС "Nt-An", "Nn", 5, 0, NULL У, 
АС "Nt An", "Nn", 5, 0, NULL У, 
АС "NE'Nn", "Nn", 5, 0, NULL У, 
АС "NC NI", "Nn", 5, 0, NULL У, 
АС "\t~\n", "Nn", 5, 0, NULL У, 
АС "\t:\n", "Nn", 5, 0, NULL У, 


АС "\t"SPACE_STRING”\n”, "An", 5, 0, NULL }, 


ACCU An”, ОЛ 
АС An" Ant, 1 
А{ "-\п”, "\п”, 1 
N ЭЛ 
A MEO] 
AÇ” cAn”, eio] 
А{ mAn”, "\п”, 1 
Re? VN 1 


A/* Even less common х/ 


0, 
0, 
0, 
0, 
0, 
0, 
0, 
0, 


0, NULL }, 
д, NULL }, 
д, NULL }, 
д, NULL }, 
д, NULL }, 
д, NULL }, 
д, NULL }, 
д, NULL }, 
АС" "SPACE_STRING”\n”, "An", 10, 0, NULL }, 


АС ".\n", "An", 15, 0, 


AC "hs AS da 
AC "An^, "An", 15 
А{ "_\п”, "\п", 15 
и 15 
AL RARE "An", 15 
ACH "\n”, 15 
А{ ":\n”, "\п”, 15 


, 


, 


, 


, 


, 


, 


, 


д, 
д, 
д, 
д, 
д, 
д, 
д, 


NU 


LE: 35 


Д{ SPACE, STRING"An", "An", 15, 0, NULL }, 


А/ж Wierd stuff that 


At "17", "О", 10, 
A( "ll", "U", 10, 
АС "11", "U", 10, 
A( "il", "U", 10, 
A( "li", "U", 10, 
АС "1)", "U", 10, 
A( "L1", "U", 10, 
A( "LI", "U", 10, 
A( "L1", "U", 10, 


А “lo! "ы", 10, 
А "ol"; "des 10, 


© 


о © © © © © © © 


© 


0, 


has happened х/ 
NULL 
NULL 
NULL 
NULL 
NULL 
NULL 
NULL 
NULL 
NULL 


NULL 
NULL 
At "cliff", "diff", 2, 0, NULL }, 
AC "Nn", "*/\n", 10, 0, NULL }, 


}, 


}, 
}, 



    /* That big black block has odd things happen to it */
    { "d", CONTIN_STRING, 10, 0, NULL },

    { "d\n", CONTIN_STRING "\n", 3, 0, NULL },
    { "S", CONTIN_STRING, 10, 0, NULL },
    { "S\n", CONTIN_STRING "\n", 3, 0, NULL },

    /* Tab-stop wonders */
    { TAB_STRING, TAB_STRING "", 0, 0, TabFilter },
    { TAB_STRING, TAB_STRING " ", 0, 0, TabFilter },
    { TAB_STRING, TAB_STRING "  ", 0, 0, TabFilter },
    { TAB_STRING, TAB_STRING "   ", 0, 0, TabFilter },
    { TAB_STRING, TAB_STRING "    ", 0, 0, TabFilter },
    { TAB_STRING, TAB_STRING "     ", 0, 0, TabFilter },
    { TAB_STRING, TAB_STRING "      ", 0, 0, TabFilter },
    { TAB_STRING, TAB_STRING "       ", 0, 0, TabFilter },
    /* Some scan errors */
    { "D ", TAB_STRING "", 1, 5, TabFilter },
    { "D ", TAB_STRING " ", 1, 5, TabFilter },
    { "D ", TAB_STRING "  ", 1, 5, TabFilter },
    { "D ", TAB_STRING "   ", 1, 5, TabFilter },
    { "D ", TAB_STRING "    ", 1, 5, TabFilter },
    { "D ", TAB_STRING "     ", 1, 5, TabFilter },
    { "D ", TAB_STRING "      ", 1, 5, TabFilter },
    { "D ", TAB_STRING "       ", 1, 5, TabFilter },
#if TAB_PAD_CHAR != ' '
#error Fix those tab patterns!
#endif
    { NULL, NULL, 0, 0, NULL }
};
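
The header comment of subst.c describes trying combinations of substitutions in order of increasing total cost until the result satisfies the parser. The fragment below is a minimal sketch of that idea only, not the repair program's actual search: it handles one-to-one substitutions, raises a cost budget one unit at a time, and accepts the first candidate that a caller-supplied check approves. DemoSubst, DemoCheck, DemoAllDigits, TryAtCost and DemoRepair are all hypothetical names, and the all-digits check merely stands in for "parses and has a correct CRC".

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a one-to-one table entry */
struct DemoSubst {
    char in;    /* Character as scanned */
    char out;   /* Character to try in its place */
    int  cost;  /* Cost of making this substitution */
};

typedef int (*DemoCheck)(char const *candidate);

/* Stand-in acceptance test: the candidate must consist of digits only */
static int
DemoAllDigits(char const *s)
{
    for (; *s != '\0'; s++)
        if (!isdigit((unsigned char)*s))
            return 0;
    return 1;
}

/* Try every combination whose total cost is exactly `budget` */
static int
TryAtCost(char const *in, char *out, size_t pos, int budget,
          struct DemoSubst const *table, size_t n, DemoCheck ok)
{
    size_t k;

    if (in[pos] == '\0') {
        out[pos] = '\0';
        return budget == 0 && ok(out);
    }
    for (k = 0; k < n; k++) {
        /* Even an identity mapping must be listed in the table */
        if (table[k].in != in[pos] || table[k].cost > budget)
            continue;
        out[pos] = table[k].out;
        if (TryAtCost(in, out, pos + 1, budget - table[k].cost,
                      table, n, ok))
            return 1;
    }
    return 0;
}

/*
 * Search candidates in order of increasing total cost.  Returns the
 * cost of the first accepted candidate (left in `out`), or -1 if none
 * is found within maxCost.  `out` needs room for strlen(in) + 1 bytes.
 */
static int
DemoRepair(char const *in, char *out, int maxCost,
           struct DemoSubst const *table, size_t n, DemoCheck ok)
{
    int budget;

    for (budget = 0; budget <= maxCost; budget++)
        if (TryAtCost(in, out, 0, budget, table, n, ok))
            return budget;
    return -1;
}
```

With a table that lists the identities at cost 0 plus O-to-0 at cost 7 and l-to-1 at cost 5, the scanned text "2Ol" is accepted as "201" at total cost 12, after every cheaper combination has been rejected. Re-running the search for each budget is wasteful, but it keeps the ordering by cost easy to see.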


/*
 * unmunge.c -- Program to convert a munged file to original form
 *
 * Copyright (C) 1997 Pretty Good Privacy, Inc.
 *
 * Designed by Colin Plumb, Mark H. Weaver, and Philip R. Zimmermann
 * Written by Mark H. Weaver
 *
 * $Id: unmunge.c,v 1.13 1997/11/13 23:27:08 mhw Exp $
 */

#include <sys/stat.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

/*#include <direct.h>  --teun: MS VC wants direct.h for mkdir */

#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <ctype.h>
#include <stdlib.h>
#include <assert.h>

#include "util.h"

typedef struct UnMungeState
{
    char const *    mungedFileName;
    char    dirName[128];
    char    fileName[128];
    char *  fileNameTail;
    int     binaryMode, tabWidth;
    long    productNumber, fileNumber, pageNumber, lineNumber;
    long    manifestLineNumber;
    word16  hdrFlags;
    CRC     pageCRC, seenPageCRC;
    FILE *  manifest;
    FILE *  file;
    FILE *  out;
} UnMungeState;

/* Returns number of characters decoded, or -1 on error */
static int
Decode4(char const src[4], byte dest[3])
{
    int     i, length;
    byte    srcVal[4];

    for (i = 0; i < 4 && src[i] != RADIX64_END_CHAR; i++)
        if ((srcVal[i] = Radix64DigitValue(src[i])) == (byte) -1)
            return -1;

    length = i - 1;
    if (length < 1)
        return -1;

    /* Zero the digits that were replaced by the end character */
    for (; i < 4; i++)
        srcVal[i] = 0;

    dest[0] = (srcVal[0] << 2) | (srcVal[1] >> 4);
    dest[1] = (srcVal[1] << 4) | (srcVal[2] >> 2);
    dest[2] = (srcVal[2] << 6) | (srcVal[3]);



    return length;
}

/*
 * Return number of characters decoded, or -1 on error
 */
static int
DecodeLine(char const *src, char *dest, int srclength)
{
    int destlength = 0;
    int result;

    if (srclength % 4 || !srclength)
        return -1;  /* Must be a multiple of 4 */

    while (srclength -= 4) {
        if (Decode4(src, dest + destlength) != 3)
            return -1;
        src += 4;
        destlength += 3;
    }
    result = Decode4(src, dest + destlength);
    if (result < 1)
        return -1;
    return destlength + result;
}
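
Decode4 and DecodeLine rely on Radix64DigitValue and RADIX64_END_CHAR from util.h, which are not shown here. The fragment below is a self-contained sketch of the same 4-characters-to-3-bytes step, under the assumption that the digits follow the standard base64 alphabet and that '=' marks the end of the data; the tools' own alphabet and end character may differ. DemoDigitValue and DemoDecode4 are hypothetical names.

```c
#include <assert.h>
#include <string.h>

/* Assumed alphabet and end character (standard base64); the tools'
 * real definitions live in util.h and may differ. */
static char const demoDigits[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
#define DEMO_END_CHAR '='

/* Returns the 6-bit value of digit c, or -1 if c is not a digit */
static int
DemoDigitValue(char c)
{
    char const *p;

    if (c == '\0')
        return -1;      /* strchr would match the terminator */
    p = strchr(demoDigits, c);
    return p != NULL ? (int)(p - demoDigits) : -1;
}

/*
 * Decode one 4-character group into up to 3 bytes.
 * Returns the number of bytes decoded (1 to 3), or -1 on error.
 */
static int
DemoDecode4(char const src[4], unsigned char dest[3])
{
    int i, v;
    int val[4] = { 0, 0, 0, 0 };

    for (i = 0; i < 4 && src[i] != DEMO_END_CHAR; i++) {
        if ((v = DemoDigitValue(src[i])) < 0)
            return -1;
        val[i] = v;
    }
    if (i < 2)
        return -1;      /* One byte needs at least two digits */

    /* Repack four 6-bit values into three 8-bit bytes */
    dest[0] = (unsigned char)((val[0] << 2) | (val[1] >> 4));
    dest[1] = (unsigned char)((val[1] << 4) | (val[2] >> 2));
    dest[2] = (unsigned char)((val[2] << 6) | val[3]);
    return i - 1;
}
```

Under the assumed alphabet the group "TWFu" decodes to the three bytes "Man", and "TQ==" decodes to the single byte 'M'. As in the listing, a short final group still writes all three destination bytes, so the caller's buffer needs room for them.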


int PrintFileError(UnMungeState *state, char const *message)
{
    fprintf(stderr, "%s, %s line %ld\n", message,
            state->mungedFileName, state->lineNumber);
    return 1;
}


int ReadManifest(UnMungeState *state, long fileNumberWanted,
                 char const *fileTailPrefix, long prefixLen)
{
    long    fileNumber = 0;
    long    firstMissingFileNum = 0, lastMissingFileNum = 0;
    char    buffer[512];
    char *  p;

    if (state->manifest == NULL)
    {
        if (fileNumberWanted != 0)
        {
            assert(fileTailPrefix != NULL);
            strncpy(state->fileName, fileTailPrefix, sizeof(state->fileName));
            state->fileName[sizeof(state->fileName) - 1] = '\0';
            state->fileNameTail = state->fileName;
        }
        return 0;
    }

    while (fgets(buffer, sizeof(buffer), state->manifest))
    {
        if ((p = strchr(buffer, '\n')) != NULL)
            *p = '\0';
        state->manifestLineNumber++;
        if (buffer[0] == 'D')
        {
            if (buffer[1] != ' ')
                goto invalidManifest;
            strncpy(state->dirName, buffer + 2, sizeof(state->dirName));
            if (state->dirName[sizeof(state->dirName) - 1] != '\0')
                goto invalidManifest;
        }
        else
        {
            fileNumber = strtol(buffer, &p, 10);
            if (p == buffer || *p != ' ')
                goto invalidManifest;
            p++;

            if (fileNumberWanted == 0 || fileNumber < fileNumberWanted)
            {
                if (firstMissingFileNum == 0)
                    firstMissingFileNum = fileNumber;
                lastMissingFileNum = fileNumber;
                continue;
            }
            else if (fileNumber > fileNumberWanted)
                break;
            else
            {
                size_t  len;

                len = strlen(state->dirName);
                assert(sizeof(state->fileName) >= sizeof(state->dirName));
                memcpy(state->fileName, state->dirName, len);
                strncpy(state->fileName + len, p,
                        sizeof(state->fileName) - len);
                if (strncmp(p, fileTailPrefix, prefixLen) != 0)
                {
                    fprintf(stderr, "Mismatched filename, headers say '%s',\n"
                            "  manifest says '%s'\n",
                            fileTailPrefix, p);
                    return 1;
                }
                /* Create each directory along the path */
                p = state->dirName;
                while ((p = strchr(p, '/')) != NULL)
                {
                    *p = '\0';
                    mkdir(state->dirName, 0777);
                    *p++ = '/';
                }
                state->fileNameTail = state->fileName + len;
                break;
            }
        }
    }
    if (firstMissingFileNum != 0)
    {
        fprintf(stderr, "Missing files %ld-%ld\n",
                firstMissingFileNum, lastMissingFileNum);
    }
    if (fileNumberWanted != 0 && fileNumber != fileNumberWanted)
    {
        fprintf(stderr, "Can't find file %ld in manifest file\n",
                fileNumberWanted);
        return 1;
    }
    return 0;

invalidManifest:
    fprintf(stderr, "Error parsing manifest file, line %ld\n",
            state->manifestLineNumber);
    return 1;
}
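
ReadManifest recognizes two kinds of manifest line: a line starting with 'D' and a space names a directory, and any other line must be a decimal file number, a space, and a file name. The fragment below sketches only that classification step. DemoKind and DemoParseManifestLine are hypothetical names, and the example lines in the note after it are invented for illustration rather than taken from a real manifest.

```c
#include <assert.h>
#include <stdlib.h>

enum DemoKind { DEMO_INVALID, DEMO_DIR, DEMO_FILE };

/*
 * Hypothetical helper: classify one manifest line whose newline has
 * already been stripped.  For DEMO_DIR, *name points at the directory
 * name.  For DEMO_FILE, *number is the file number and *name points
 * at the file name.
 */
static enum DemoKind
DemoParseManifestLine(char const *line, long *number, char const **name)
{
    char *end;

    assert(line != NULL && number != NULL && name != NULL);
    if (line[0] == 'D') {
        if (line[1] != ' ')
            return DEMO_INVALID;
        *name = line + 2;
        return DEMO_DIR;
    }
    *number = strtol(line, &end, 10);
    if (end == line || *end != ' ')
        return DEMO_INVALID;    /* No number, or no space after it */
    *name = end + 1;
    return DEMO_FILE;
}
```

Given the invented lines "D src/" and "3 unmunge.c", the first is classified as a directory named "src/" and the second as file number 3 named "unmunge.c". A line with no leading number, such as "unmunge.c" alone, is rejected, which corresponds to the invalidManifest path in the listing.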

int UnMungeFile(char const *mungedFileName, char const *manifestFileName,
                int forceOverwrite, int forcePartialFiles)
{
    UnMungeState *  state;
    EncodeFormat const *fmt = NULL;
    char    buffer[512];
    char    outbuf[BYTES_PER_LINE+1];
    char *  line;
    char *  lineData;
    char *  p;
    int     length;
    int     result = 0;
    int     skipPage = 0;
    CRC     lineCRC;
    word32  num;

    state = (UnMungeState *)calloc(1, sizeof(*state));
    state->mungedFileName = mungedFileName;

    if (manifestFileName != NULL)
    {
        if ((state->manifest = fopen(manifestFileName, "r")) == NULL)
            goto errnoError;
    }

    if ((state->file = fopen(state->mungedFileName, "r")) == NULL)
        goto errnoError;

    while (!feof(state->file))
    {
        if (fgets(buffer, sizeof(buffer), state->file) == NULL)
        {
            if (feof(state->file))
                break;
            goto fileError;
        }

        state->lineNumber++;

        line = buffer;
        /* Strip leading whitespace */
        while (isspace(*line))
            line++;
        if (*line == '\0')
            continue;

        /* Strip trailing whitespace */
        p = line + strlen(line);
        while (p > line && (byte)p[-1] < 128 && isspace(p[-1]))
            p--;

        lineData = line + PREFIX_LENGTH;

        /* Pad up to at least PREFIX_LENGTH */
        while (p < lineData)
            *p++ = ' ';
        *p++ = '\n';
        *p = '\0';
        length = p - lineData;


        if (line[0] == HDR_PREFIX_CHAR)
        {
            fmt = FindFormat(line[1]);
            if (!fmt)
            {
                result = PrintFileError(state, "ERROR: Invalid header type");
                goto error;
            }
        }

        lineCRC = CalculateCRC(fmt->lineCRC, 0, (byte const *)lineData, length);

        p = line + EncodedLength(fmt, fmt->runningCRCBits);
        if (DecodeCheckDigits(fmt, p, NULL, fmt->lineCRC->bits, &num)
            || lineCRC != num)
        {
            result = PrintFileError(state, "ERROR: Line CRC failed");
            goto error;
        }


        if (line[0] == HDR_PREFIX_CHAR)
        {
            int     formatVersion;
            int     flags;
            CRC     seenPageCRC;
            int     tabWidth;
            long    productNumber;
            long    fileNumber;
            long    pageNumber;
            char *  fileNameTail;
            int     skipNextPage = 0;
            char *  p;
            EncodeFormat const *hFmt = &hexFormat;

            /* Parse header line */
            p = lineData;

            if (DecodeCheckDigits(hFmt, p, &p, HDR_VERSION_BITS, &num))
            {
            invalidHeader:
                result = PrintFileError(state, "ERROR: Invalid header");
                goto error;
            }
            formatVersion = num;

            if (DecodeCheckDigits(hFmt, p, &p, HDR_FLAG_BITS, &num))
                goto invalidHeader;
            flags = num;

            if (DecodeCheckDigits(hFmt, p, &p, fmt->pageCRC->bits, &num))
                goto invalidHeader;
            seenPageCRC = num;

            if (DecodeCheckDigits(hFmt, p, &p, HDR_TABWIDTH_BITS, &num))
                goto invalidHeader;
            tabWidth = num;

            if (DecodeCheckDigits(hFmt, p, &p, HDR_PRODNUM_BITS, &num))
                goto invalidHeader;
            productNumber = num;

            if (DecodeCheckDigits(hFmt, p, &p, HDR_FILENUM_BITS, &num))
                goto invalidHeader;
            fileNumber = num;

            if (sscanf(p, " Page %ld of ", &pageNumber) < 1)
                goto invalidHeader;



            if (formatVersion > 0)
            {
                result = PrintFileError(state,
                        "ERROR: Format too new for "
                        "this version of unmunge");
                goto error;
            }

            p = strstr(p, " of ");
            if (p == NULL)
                goto invalidHeader;

            fileNameTail = p + 4;
            p = fileNameTail + strlen(fileNameTail);
            if (p < fileNameTail + 3 || p[-1] != '\n')
                goto invalidHeader;
            else
                p[-1] = '\0';

            if (state->out != NULL && state->pageCRC != state->seenPageCRC)
            {
                result = PrintFileError(state,
                        "ERROR: Page CRC mismatch on page before");
                goto error;
            }

            if ((state->hdrFlags & HDR_FLAG_LASTPAGE) && state->out != NULL)
            {
                fclose(state->out);
                state->out = NULL;
            }


            if (state->out != NULL)
            {
                if (pageNumber != state->pageNumber + 1 ||
                    fileNumber != state->fileNumber ||
                    productNumber != state->productNumber ||
                    tabWidth != state->tabWidth ||
                    strcmp(fileNameTail, state->fileNameTail) != 0)
                {
                    if (fileNumber == state->fileNumber &&
                        pageNumber > state->pageNumber + 1)
                    {
                        (void)PrintFileError(state,
                                "ERROR: Missing pages of this file");
                        if (forcePartialFiles && !state->binaryMode)
                        {
                            fputs("\n\n@@@@@@ Missing pages here! @@@@@@\n\n",
                                  state->out);
                        }
                        else
                        {
                            skipNextPage = 1;
                            fclose(state->out);
                            state->out = NULL;
                            remove(state->fileName);
                        }
                    }
                    else
                    {
                        (void)PrintFileError(state,
                                "ERROR: Missing pages of previous file");
                        if (forcePartialFiles && !state->binaryMode)
                        {
                            fputs("\n\n@@@@@@ Missing pages here! @@@@@@\n\n",
                                  state->out);
                            /* Make it non-fatal, though... */
                            fclose(state->out);
                            state->out = NULL;
                        }
                        else
                        {
                            fclose(state->out);
                            state->out = NULL;
                            remove(state->fileName);
                        }
                    }
                }
            }

            if (state->out == NULL)
            {
                if (pageNumber != 1 && !skipPage)
                    (void)PrintFileError(state,
                            "ERROR: File doesn't begin with page 1");

                state->binaryMode = (tabWidth == 0);

                if (pageNumber != 1 && (state->binaryMode
                                        || !forcePartialFiles))
                {
                    skipNextPage = 1;
                }
                else
                {
                    /* TODO: Use global filelist to get pathname */
                    result = ReadManifest(state, fileNumber, fileNameTail,
                                          strlen(fileNameTail));
                    if (result != 0)
                        goto error;

                    if (!forceOverwrite)
                    {
                        FILE *  file;

                        /* Make sure file doesn't already exist */
                        file = fopen(state->fileName, "r");
                        if (file != NULL)
                        {
                            fclose(file);
                            fprintf(stderr, "ERROR: %s already exists\n",
                                    state->fileName);
                            result = 1;
                            goto error;
                        }
                    }

                    state->out = fopen(state->fileName,
                                       state->binaryMode ? "wb" : "w");
                    if (state->out == NULL)
                        goto errnoError;

                    if (pageNumber != 1)
                        fputs("\n\n@@@@@@ Missing pages here! @@@@@@\n\n",
                              state->out);
                }
            }

            state->pageCRC = 0;
            state->seenPageCRC = seenPageCRC;
            state->hdrFlags = (word16)flags;
            state->pageNumber = pageNumber;
            state->fileNumber = fileNumber;
            state->productNumber = productNumber;
            state->tabWidth = tabWidth;
            skipPage = skipNextPage;
        }

        else if (!skipPage)
        {
            if (state->out == NULL)
            {
                result = PrintFileError(state, "ERROR: Missing header line");
                goto error;
            }

            /* Normal data line */
            state->pageCRC = CalculateCRC(fmt->pageCRC, state->pageCRC,
                                          (byte const *)lineData,
                                          length);

ДАЛІіпер[ = 706; 

            if (DecodeCheckDigits(fmt, line, NULL, fmt->runningCRCBits, &num)
                || RunningCRCFromPageCRC(fmt, state->pageCRC) != num)
            {
                result = PrintFileError(state, "ERROR: Running CRC failed");
                goto error;
            }

            if (state->binaryMode)
            {
                length = DecodeLine(lineData, outbuf, length-1);
                if (length < 0 || length > BYTES_PER_LINE) {
                    result = PrintFileError(state,
                            "ERROR: Corrupt radix-64 data");
                    goto error;
                }
                fwrite(outbuf, 1, length, state->out);
            }

AAAelse 

АЛА 

AAAAp = lineData; 

AAAAwhile (хр != '\0') 


АЛЛА 

AAAAAif (кр == TAB. CHAR) 
ААААА( 

АДАДАДр++; 
AAAAAAputc('Nt', state->out); 


AAAAAA/*while ((p - lineData) % state->tabWidth) 
ААААААС 

AAAAAAAif (xp == '“п') 

AAAAAAAAbreak; 

AAAAAAAelse if (хр == ' ') 
AAAAAAAAp+; 

AAAAAAAelse 

AAAAAAA( 

AAAAAAAAresult = PrintFileError(state, 
AAAAAAAAAAAA"ERROR: Not enough spaces " 
AANAAAAAAAAAA"after a tab character"); 
AAAAAAAAgoto error; 

AAAAAAA) 

AAAAAA}«/ 

AAAAA} 

AAAAAelse if (xp == FORMFEED_CHAR) 
AAAAAL 

АДАДАДр++; 

AAAAAAif (xp != '“п') 


7bblcf AAAAAAL 
163614 AAAAAAAresult = PrintFileError(state, 
5dc610 AAAAAAAAAAA"ERROR: Formfeed character " 
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1faf5a 
a260ff 
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AAAAAAAAAAA"not at end of line"); 
AAAAAAAgoto error; 

AAAAAA} 

AAAAAAp++; A/* Skip newline */ 
AAAAAAputc('\f', state->out); 
AAAAA} 

AAAAAelse if (*p == CONTIN_CHAR)
AAAAA{
AAAAAAp++;
AAAAAAif (*p != '\n')
AAAAAA{

AAAAAAAresult = PrintFileError(state, 
AAAAAAAAAAA"ERROR: Continuation character " 
AAAAAAAAAAA"not at end of line"); 
AAAAAAAgoto error; 

AAAAAA}

AAAAAAp++; A/* Skip newline */ 
AAAAA} 

AAAAAelse if (*p == SPACE_CHAR)
AAAAA{
AAAAAAputc(' ', state->out);
AAAAAAp++;
AAAAA}
AAAAAelse
AAAAA{
AAAAAAputc(*p, state->out);
AAAAAAp++;

AAAAA} 


Aif (state->out != NULL)
A{
AAif (!(state->hdrFlags & HDR_FLAG_LASTPAGE))
AA{

AAAresult = PrintFileError(state, "ERROR: Missing pages"); 
AAAgoto error; 

AA}
AAif (state->pageCRC != state->seenPageCRC)
AA{

AAAresult = PrintFileError(state, 

AAAAAAA"ERROR: Page CRC failed on previous page"); 
AAAgoto error; 

AA}
A}

A/* Check for missing files at the end */
Aresult = ReadManifest(state, 0, NULL, 0);
Agoto done; 


errnoError: 
Aresult = errno;


Agoto printError; 


fileError: 
Aresult = ferror(state->file); 


printError: 
Afprintf(stderr, "ERROR: %s\n", strerror(result)); 


error: 


a6b38c done: 
c45a5a Aif (state != NULL) 
404780 A{
ea2b54 AAif (state->out != NULL)
582373 AAAfclose(state->out);
597d27 AAif (state->file != NULL)
b62cac AAAfclose(state->file);
d1788d AAif (state->manifest != NULL)
250143 AAAfclose(state->manifest);

50b580 AAfree(state); 

228350 A} 

caf8a6 Areturn result; 

36efe6 }

baaf5a 

781926 void UsageAndExit(int result) 

2ebb36 { 

9b43bc Afprintf(stderr, 

992988 AAA"Usage: unmunge [-fp] <file> [<manifest>]\n"
57040a AAA"  -f  Force overwrites of existing files\n"
cf1a54 AAA"  -p  Force unmunge of partial files\n");
bf631a Aexit(result);
24efe6 }

73af5a 

919e06 int main(int argc, char *argv[])
f7bb36 {
5aafb8 AintAAresult = 0;
540139 AintAAforceOverwrite = 0;
7a4098 AintAAforcePartialFiles = 0;
c58f01 Achar *AfileName = NULL;
701c0f Achar *AmanifestFileName = NULL;
Q@bbdf AintAAi, j; 

e2af5a 

d819bb AInitUtil();

bfaf5a 

70f7dc Afor (i = 1; i < argc && argv[i][0] == '-'; i++)
534780 A{

Засбав AAif (0 == strcmp(argv[i], "--")) 
db0751 AA{
cc1916 AAAi++;
c25472 AAAbreak;

fb5381 AA} 

8сдас5 AAfor (j = 1; argv[i][j] != '\0'; j++)
eb0751 AA{
7c4857 AAAif (argv[i][j] == 'h')
75873f AAAAUsageAndExit(0);
b11d6f AAAelse if (argv[i][j] == 'f')
0745c5 AAAAforceOverwrite = 1;
ce9554 AAAelse if (argv[i][j] == 'p')
9349c5 AAAAforcePartialFiles = 1;
672376 AAAelse
c5c085 AAA{
b11b5c AAAAfprintf(stderr, "ERROR: Unrecognized option -%c\n", argv[i][j]);
189b84 AAAAUsageAndExit(1);
469455 AAA}
835381 AA}

298350 A} 

8aaf5a 

59962f Aif (i < argc)
648c24 AAfileName = argv[i++];
de962f Aif (i < argc)
75f262 AAmanifestFileName = argv[i++];

aab454 Aif (fileName == NULL || i < argc) 
876fd5 AAUsageAndExit(1); 

9daf5a 

3439b1 Aif ((result = UnMungeFile(fileName, manifestFileName,
a73854 AAAAAAAforceOverwrite, forcePartialFiles)) != 0)
c0d780 A{
f179cf AA/* If result > 0, message should have already been printed */
AAif (result < 0)
AAAfprintf(stderr, "ERROR: %s\n", strerror(result));
AAexit(1);
A}

Areturn 0;
}

/*
 * Local Variables:
 * tab-width: 4
 * End:
 * vi: ts=4 sw=4
 * vim: si
 */

52951b BIN = bin

40c599 OBJ = obj
6cdce2 DIRS = $(BIN) $(OBJ)
c8a897 CFLAGS = -g -O -W -Wall
96af5a
75a0fe PERL = $(addprefix $(BIN)/, bootstrap bootstrap2 makemanifest psgen sortpages yapp)
38f2f3 BINS = $(addprefix $(BIN)/, unmunge munge repair)
9f82bb UNMUNGE_OBJS = $(addprefix $(OBJ)/, util.o unmunge.o)
869df2 MUNGE_OBJS = $(addprefix $(OBJ)/, util.o munge.o)
82c566 REPAIR_OBJS = $(addprefix $(OBJ)/, util.o heap.o mempool.o subst.o repair.o)
91af5a 

76275b $(shell mkdir -p $(DIRS)) 

62af5a 

0662ed all: $(BINS) $(PERL) 

c6af5a 

068c53 $(BIN)/%: %.pl 

b649d2 Acp $< $@; chmod +x $@
07af5a
9cfb7a $(BIN)/unmunge: $(UNMUNGE_OBJS)
61aea5 A$(CC) $(CFLAGS) -o $@ $(UNMUNGE_OBJS)
8baf5a
cd7aea $(BIN)/munge: $(MUNGE_OBJS)
81bea2 A$(CC) $(CFLAGS) -o $@ $(MUNGE_OBJS)
53af5a
9a693d $(BIN)/repair: $(REPAIR_OBJS)
8d636a A$(CC) $(CFLAGS) -o $@ $(REPAIR_OBJS)
1daf5a
cfbc06 $(OBJ)/%.o: %.c
7826ae A$(CC) $(CFLAGS) -c -o $@ $<

e7af5a 

cb49c8 clean: 

966f95 Arm -f $(OBJ)/* $(BIN)/*.core

05af5a 

4a1ed2 cleaner:

bd60b3 Arm -rf $(DIRS) 


